Andy Beal has a nice write-up of Google showing off its word clustering tools at the recent Web 2.0 conference: Web 2.0 - Exclusive Demonstration of Clustering from Google. Jason Calacanis also has an MP3 audio file of the presentation you can listen to.
Neither Google Sets nor Related Searches provides clustering as was demonstrated, or as can be seen via Vivisimo (or Clusty, Vivisimo's recently launched consumer site). But some of the underlying clustering technology may be used for these.
Also interesting is the mention of Google excluding "noisy" data to focus on the key part of a page. Search engines commonly ignore "stop words" such as "the" when indexing or searching. However, Google's "named entities" would go beyond that to focus on the core content of a page.
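To make the stop word idea concrete, here is a minimal sketch in Python. The stop word list and the index_terms function are made up for illustration; this is not how Google or any particular search engine does it, just the general technique of dropping very common words before indexing.

```python
# Hypothetical, deliberately tiny stop word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in"}


def index_terms(text):
    """Lowercase the text, split on whitespace, strip surrounding
    punctuation, and drop any word found in STOP_WORDS."""
    terms = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if word and word not in STOP_WORDS:
            terms.append(word)
    return terms


print(index_terms("The history of the search engine"))
# prints ['history', 'search', 'engine']
```

Stop word removal only filters by how common a word is. Focusing on the core content of a page, as described in the demonstration, would need to go further than a fixed word list like this one.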
Both clustering and named entities have interesting applications to searchers and search marketing. By understanding clusters of search results, it may be easier for Google (and other search engines) to spot pages that don't seem to belong with a particular topic, especially spam pages, which might stand out more given their often artificial nature.
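As a rough illustration of how an out-of-place page might stand out within a cluster, the sketch below compares each page's words with every other page's words and flags pages with low average overlap. Everything here is an assumption for illustration: the jaccard and outliers functions, the sample pages, and the 0.2 threshold are invented, and nothing was said about which similarity measure Google actually uses.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: shared items / all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def outliers(docs, threshold=0.2):
    """Return names of pages whose average similarity to the other
    pages in the cluster falls below the (hypothetical) threshold."""
    term_sets = {name: set(text.lower().split()) for name, text in docs.items()}
    flagged = []
    for name, terms in term_sets.items():
        others = [t for n, t in term_sets.items() if n != name]
        if not others:
            continue
        avg = sum(jaccard(terms, o) for o in others) / len(others)
        if avg < threshold:
            flagged.append(name)
    return flagged


# Made-up cluster of pages returned for a "jaguar" search.
docs = {
    "p1": "jaguar big cat habitat rainforest",
    "p2": "jaguar cat habitat south america",
    "p3": "big cat jaguar rainforest prey",
    "spam": "cheap pills jaguar buy now discount",
}
print(outliers(docs))
# prints ['spam']
```

The spam page shares only the query word with the other pages, so its average overlap is well below theirs.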
Similarly, understanding the key concepts of a page and ranking pages first on a concept match, then on an actual word match, might help eliminate some poor matches.
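The two-step ordering described above can be sketched as a sort on a pair of scores: concept match first, word match second. This is only an illustration under stated assumptions. The rank function, the page records, and the hand-assigned "concepts" labels are all hypothetical, and how a real engine would derive a page's concepts is not covered here.

```python
def rank(pages, query_concept, query_words):
    """Order pages by (concept match, number of matching query words),
    best first. Concept labels are assumed to be given in advance."""
    query_words = set(query_words)

    def score(page):
        concept_match = 1 if query_concept in page["concepts"] else 0
        word_matches = len(query_words & set(page["text"].lower().split()))
        return (concept_match, word_matches)

    return sorted(pages, key=score, reverse=True)


# Made-up pages with hand-assigned concept labels.
pages = [
    {"url": "a", "concepts": {"cars"}, "text": "jaguar dealership prices jaguar models"},
    {"url": "b", "concepts": {"animals"}, "text": "the jaguar is a big cat"},
    {"url": "c", "concepts": {"animals"}, "text": "big cats of south america"},
]

ranked = rank(pages, "animals", ["jaguar", "habitat"])
print([p["url"] for p in ranked])
# prints ['b', 'c', 'a']
```

Page "a" contains the query word but is about cars, so it falls below page "c", which matches the concept without containing the word at all.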