BRight: A Distributed System for Web Information Indexing and Searching
Abstract
The extraordinary size and exponential growth of the World-Wide Web demand new approaches to the problems of indexing and searching the information it contains. In this paper, we discuss the limitations of the centralized approach currently in use, review recent work on distributed architectures for Networked Information Retrieval, and present BRight!, a distributed system for indexing and searching information on the World-Wide Web. We then focus on BRight!, discussing its architecture and its underlying concepts of Web Views, comparing it with other systems, and showing how it scales with the growth of the Web. The current version of the prototype is presented, and future work is outlined.