From: Kendra Smith
Sent: Wednesday, January 12, 2000 1:14 AM
To: Microsoft Research Tech Talk, Sem. Notice
Cc: Kendra Smith
Subject: UW-CSE Colloq / 1-18-2000 / Burrows / Compaq SRC / The AltaVista Indexing and Search Engine
*NOTE* This lecture will be broadcast live via the Internet. See
http://www.cs.washington.edu/news/colloq.info.html for more information.
UNIVERSITY OF WASHINGTON
Seattle, Washington 98195
Department of Computer Science and Engineering
Box 352350
(206) 543-1695
COLLOQUIUM
SPEAKER: Mike Burrows, Compaq SRC
TITLE: The AltaVista Indexing and Search Engine
DATE: Tuesday, January 18, 2000
TIME: 3:30 pm
PLACE: 134 Sieg Hall
HOST: Hank Levy
ABSTRACT:
I'll motivate the talk with an overview of how a web search engine is
organized. I'll then describe in more depth a key component of the
AltaVista search engine: its indexing library. The library manages a set
of inverted files, and provides mechanisms to construct and optimize
complex queries on those inverted files. It is a low-level library; it
does not perform high-level functions such as parsing queries, parsing
text to be indexed, or computing ranking scores. Instead it supplies the
interface to allow these operations to be implemented. The design goals
were to enable efficient queries on bodies of text up to a few hundred
gigabytes in size (e.g. AltaVista) without sacrificing too much
generality, and without giving up on small applications (e.g. mail
directories). The key design choices covered include:
- the use of flat inverted files, and the techniques to allow
their efficient update.
- the byte-level format of the inverted files, and the sequence
of instructions used to parse that format.
- the internal abstractions used to construct complex queries.
At the end of the talk, I'll describe some security-related and
failure-related issues with the original AltaVista Web site.
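For readers unfamiliar with the terms in the abstract, the sketch below
illustrates what an inverted file and a byte-level posting format can look
like. It is an illustration only, not AltaVista's actual format or API: the
function names (build_index, encode_postings, decode_postings, intersect)
and the choice of delta plus variable-byte encoding are assumptions made
for brevity.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the sorted list of document ids that contain it."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for word in set(text.lower().split()):
            index[word].append(doc_id)  # doc ids arrive in increasing order
    return index


def encode_postings(doc_ids):
    """Store gaps between sorted ids, 7 bits per byte (hypothetical format)."""
    out = bytearray()
    prev = 0
    for doc_id in doc_ids:
        gap = doc_id - prev
        prev = doc_id
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)  # high bit set: more bytes follow
            gap >>= 7
        out.append(gap)  # final byte of this gap has the high bit clear
    return bytes(out)


def decode_postings(data):
    """Inverse of encode_postings: rebuild the sorted list of ids."""
    ids, value, shift, prev = [], 0, 0, 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += value
            ids.append(prev)
            value, shift = 0, 0
    return ids


def intersect(a, b):
    """AND query: merge two sorted posting lists."""
    i, j, result = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


docs = ["the quick fox", "the lazy dog", "quick dog"]
index = build_index(docs)
packed = {word: encode_postings(ids) for word, ids in index.items()}
both = intersect(decode_postings(packed["quick"]), decode_postings(packed["dog"]))
```

Here `both` holds the ids of documents containing both "quick" and "dog".
Storing gaps rather than absolute ids keeps most entries to a single byte,
which is one common reason to design the format at the byte level.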
Refreshments to follow.
Email: talk-info@cs.washington.edu
Info: http://www.cs.washington.edu