Google is a data company. No, it’s not a search engine company or an advertising company; it’s a data company. I’m talking about their core expertise here.
My thinking on this emerges from a few different discussions I had during my recent visit to the Google Searchology event. First, during the presentation by Google VP of Engineering Udi Manber, there were two interesting comments:
- Google does its testing of new algorithms and changes to their UI on a complete copy of the web. This is already impressive!
- With its Cross Language Information Retrieval (CLIR) program, Google is performing real-time, on-the-fly translations into the 12 different languages that the program supports.
That means that they are keeping two complete copies of the web, and are doing real-time translation into 12 languages on a query-by-query basis. That’s some serious data management.
After the main event, when we sat down for lunch, I spoke with Peter Norvig, director of research at Google. Peter is the first person who got me focused on this notion of Google being a data company. He underscored that by telling me a story of the early Google days. The gist of the story was that before Google was conceived, it occurred to Sergey Brin and Larry Page that having a copy of the entire web would be a useful thing, but they did not originally know how they would use it.
It was only after the fact that they thought about building a search engine, once they realized that the web’s complexity would not be easily catalogued in a human-edited directory.
Then there’s the conversation that Manoj Jasra of Enquiro told me about. He was talking with Larry Page, who relayed to Manoj the notion that Google would like to get to the point where they are indexing data as you are typing it in (don’t scream Big Brother yet; the intention here was that this would only be for data intended for the public).
While this idea is not practical (the earliest Google would be able to get the data is when it is made public by being published), it does communicate something about the Google mindset.
Regardless of whatever else may happen (e.g. Google acquiring companies like Doubleclick), I would expect them to keep leveraging this core technology expertise for the foreseeable future.