Over at WebmasterWorld, Tedster references an interesting short paper by Googler Anna Lynn Patterson about creating your own search engine. The document makes for a good read.
The paper was published in April 2004, when she was a student at Stanford University. She is also the person whose name appears on the recent Google patent application titled "Detecting spam documents in a phrase based information retrieval system."
Basically, she breaks the problem down into hard drive space, lots of servers, and CPU power. Anna's document is a good initial primer, but there is another aspect of building a search engine that deserves some emphasis.
The search engine companies have built the largest networks of servers the world has ever known. When I think of Google's core technology assets, I don't think about search engine algorithms; I think about massively deployed server networks operating in close harmony.