Learning About Search Engines From Google Engineers
Want to learn how Google works? A new archive of publications by Google employees offers deep insights into many aspects of the search engine's operation.
The archive is organized by topic, covering the major functions required to run a search engine, such as information retrieval, search engine design and machine learning.
Many of these articles are heavy-duty, industrial-strength treatises suitable only for those with the technical background needed to follow the math and logic presented. But some are eminently readable for non-technical folks, such as “Searching the World Wide Web,” which originally appeared in Science magazine.
Most links on this page don’t point directly to articles, but rather to abstracts and other useful information provided by CiteSeer from the NEC Research Institute.
CiteSeer is a very cool digital library of scientific literature created by a team including Dr. Steve Lawrence, who currently works at Google. CiteSeer results for a document offer a ton of useful related information about the paper.
For example, the CiteSeer entry for The Anatomy of a Large-Scale Hypertextual Web Search Engine, the definitive source about Google written by its founders Larry Page and Sergey Brin, includes an abstract, links to articles that cite the paper, an active bibliography (related documents), similar documents based on text, related documents from co-citation… in short, not only a snapshot of the article itself, but also numerous links to other directly related articles, all without your needing to search for them.
There are also links to author home pages, to other articles found at the same source (in this case, Stanford University’s database group technical reports, another useful collection in its own right), and so on.
The result page for each article provides links that allow you to read papers in a variety of formats (PDF, PostScript, DjVu, etc.). Tip: The PDF format generally works best, but you might also try pasting the title of an article (in quotes) directly into a Google search box. If the article is in Google’s index, you’ll often also see a “view as HTML” link that lets you read it directly in your browser.
Papers Written by Googlers
http://labs.google.com/papers.html
NOTE: Article links often change. In case of a bad link, use the publication’s search facility, which most have, and search for the headline.