Over at WebmasterWorld, Tedster references an interesting short paper by Googler Anna Lynn Patterson about creating your own search engine. It makes for a good read.
The paper was published in April of 2004, when she was a student at Stanford University. She is also the person whose name appears on the recent Google patent application titled "Detecting spam documents in a phrase based information retrieval system."
Basically, she breaks the problem down into hard drive space, lots of servers, and CPU power. Anna's document is a good initial primer, but there is another aspect of building a search engine that deserves some emphasis.
The search engine companies have built the largest networks of servers the world has ever known. When I think of Google's core technology assets, I don't think about search engine algorithms; I think about massively deployed server networks operating in close harmony.