New Search Patent Filings: October 4, 2006 – Using Google to find the Cable Guy

A few new patent filings from Microsoft and Google. Microsoft explores the use of self-organizing maps in one patent filing, and shows how those could be used to make geographic searches more relevant. They also explore URL canonicalization in another.

Google’s newest patent filing expands upon Google Transit and Maps by providing information about taxi cabs, shuttles, limos, delivery trucks, moving vans, and other service vehicles.

Amazon is granted a patent for recommendation services.


This first patent application discusses the use of self-organizing maps to increase search relevancy. It points to Self Organization of a Massive Document Collection as a reference source for readers of the patent filing, to help them understand how such a process could be implemented.

System and method for improving search relevance
Invented by Christopher Weare
Assigned to Microsoft
US Patent Application 20060218138
Published September 28, 2006
Filed: March 25, 2005


A system and method for performing context based document searching is provided. A grid of content tiles is constructed corresponding to a desired concept space. Each content tile is assigned a content tag and is associated with a series of feature values. The feature values are trained to correspond to various regions of the content space. Documents are associated with one or more content tags based on a comparison of document feature values with content tile feature values. A search query is modified to include one or more content tags based on the terms in the search query and/or user preferences. The search query is then matched to documents associated with content tags contained in the search query.
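To make the abstract a little more concrete, here is a minimal sketch of the tagging and matching steps. It is my own illustration, not code from the filing: the tile tags, the two-dimensional feature vectors, and the function names are all hypothetical, and the tile values are hard-coded rather than trained with a self-organizing map.

```python
import math

# Hypothetical 2x2 grid of content tiles: content tag -> feature values.
# In the filing these values would be trained; here they are hard-coded.
# The two features are assumed to mean (cars, animals).
TILES = {
    "T00": (1.0, 0.0),
    "T01": (0.7, 0.3),
    "T10": (0.3, 0.7),
    "T11": (0.0, 1.0),
}


def nearest_tag(features):
    """Return the tag of the tile whose feature values are closest."""
    return min(TILES, key=lambda tag: math.dist(TILES[tag], features))


def tag_documents(doc_features):
    """Associate every document with the content tag of its nearest tile."""
    return {doc_id: nearest_tag(f) for doc_id, f in doc_features.items()}


def search(term, query_features, doc_text, doc_tags):
    """Add a content tag to the query, then match on both term and tag."""
    wanted = nearest_tag(query_features)
    return sorted(
        doc_id
        for doc_id, text in doc_text.items()
        if term in text.lower().split() and doc_tags[doc_id] == wanted
    )


DOC_TEXT = {
    "a": "jaguar wins the race",
    "b": "jaguar habitat in the rainforest",
}
DOC_FEATURES = {"a": (0.9, 0.1), "b": (0.1, 0.9)}
DOC_TAGS = tag_documents(DOC_FEATURES)
```

With this toy data, a query for "jaguar" from someone whose context leans toward cars would only match document "a", while the same word from a wildlife context would only match document "b".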

The ideas in the previous document from Microsoft could be used to help increase the value of some specialized searches, such as ones based upon geographical location information. This next patent application is a companion filing to that one, and relies upon the same technology to help with searches where location is important.

System and method for location based search
Invented by Christopher Weare, Ashley Feniello, and Randy Kern
Assigned to Microsoft
US Patent Application 20060218114
Published September 28, 2006
Filed: March 25, 2005


A system and method for performing geographic based document searching. A grid of location tiles is constructed corresponding to a desired geographic area. A location tag is assigned to each location tile. Documents are searched to identify a geographic location. The documents are associated with one or more location tags based on the location tiles corresponding to the identified geographic location. The geographic location of a search query is also identified. The search query is modified to include one or more location tags corresponding to the location of the search query. The search query is then matched to documents associated with location tags contained in the search query.
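A rough sketch of how location tiles might work follows. Everything specific in it is an assumption of mine rather than a detail from the filing: the half-degree tile size, the tag format, the sample documents, and the function names. It also assumes each document's latitude and longitude have already been identified.

```python
import math
from collections import defaultdict

TILE_DEG = 0.5  # hypothetical tile size, in degrees of latitude/longitude


def location_tag(lat, lon):
    """Map a coordinate to the tag of the grid tile that contains it."""
    row = math.floor((lat + 90) / TILE_DEG)
    col = math.floor((lon + 180) / TILE_DEG)
    return f"loc_{row}_{col}"


def build_index(docs):
    """Group document ids under the location tag of their coordinates."""
    index = defaultdict(set)
    for doc_id, (lat, lon, _text) in docs.items():
        index[location_tag(lat, lon)].add(doc_id)
    return index


def local_search(term, lat, lon, docs, index):
    """Tag the query with its tile, then match documents sharing that tag."""
    candidates = index.get(location_tag(lat, lon), set())
    return sorted(d for d in candidates if term in docs[d][2].lower())


# Illustrative documents: id -> (latitude, longitude, text)
DOCS = {
    "pizza_seattle": (47.61, -122.33, "Wood-fired pizza downtown"),
    "pizza_portland": (45.52, -122.68, "Pizza by the slice"),
}
INDEX = build_index(DOCS)
```

A searcher standing near a tile edge would miss results just across the boundary, which may be one reason the abstract speaks of "one or more" location tags; a fuller version could also look up the neighboring tiles.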

Not long ago, three researchers from Technion, including Google’s Ziv Bar-Yossef, published a paper called Do not Crawl in the DUST: Different URLs with Similar Text. The following patent filing from Microsoft’s Marc Najork visits some of the same territory, looking carefully at ways to pick the best single URL for pages that are substantially similar yet are at different URLs.

Systems and methods for inferring uniform resource locator (URL) normalization rules
Invented by Marc Alexander Najork
Assigned to Microsoft
US Patent Application 20060218143
Published September 28, 2006
Filed: March 25, 2005


Different URLs that actually reference the same web page or other web resource are detected and that information is used to only download one instance of a web page or web resource from a web site. All web pages or web resources downloaded from a web server are compared to identify which are substantially identical. Once identical web pages or web resources with different URLs are found, the different URLs are then analyzed to identify what portions of the URL are essential for identifying a particular web page or web resource, and what portions are irrelevant. Once this has been done for each set of substantially identical web pages or web resources (also referred to as an “equivalence class” herein), these per-equivalence-class rules are generalized to trans-equivalence-class rules. There are two rule-learning steps: step (1), where it is learned for each equivalence class what portions of the URLs in that class are relevant for selecting the page and what portions are not; and step (2), where the per-equivalence-class rules constructed during step (1) are generalized to rules that cover many equivalence classes. Once a rule is determined, it is applied to the class of web pages or web resources to identify errors. If there are no errors, the rule is activated and is then used by the web crawler for future crawling to avoid the download of duplicative web pages or web resources.
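The two learning steps and the validation step can be sketched in a few lines. This is a simplified illustration under my own assumptions, not the method in the filing: it only considers query-string parameters as the "portions" of a URL, it treats a content fingerprint as already computed for each page, and the function names and sample URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def group_by_content(pages):
    """Build equivalence classes: sets of URLs sharing a content fingerprint."""
    classes = defaultdict(set)
    for url, fingerprint in pages.items():
        classes[fingerprint].add(url)
    return list(classes.values())


def varying_params(urls):
    """Step 1: parameters whose value differs inside one equivalence class."""
    queries = [dict(parse_qsl(urlsplit(u).query)) for u in urls]
    names = set().union(*queries)
    return {n for n in names if len({q.get(n) for q in queries}) > 1}


def normalize(url, drop):
    """Rewrite a URL without the dropped parameters, in sorted order."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k not in drop)
    return urlunsplit(parts._replace(query=urlencode(kept)))


def is_safe(pages, drop):
    """Check the rule for errors: it must never merge pages that differ."""
    seen = {}
    for url, fingerprint in pages.items():
        if seen.setdefault(normalize(url, drop), fingerprint) != fingerprint:
            return False
    return True


def infer_rule(pages):
    """Step 2: pool per-class findings and keep only parameters that validate."""
    candidates = set()
    for urls in group_by_content(pages):
        candidates |= varying_params(urls)
    accepted = set()
    for name in sorted(candidates):
        if is_safe(pages, accepted | {name}):
            accepted.add(name)
    return accepted


# Illustrative crawl results: URL -> content fingerprint
PAGES = {
    "http://example.com/item?id=1&sid=aaa": "h1",
    "http://example.com/item?id=1&sid=bbb": "h1",
    "http://example.com/item?id=2&sid=ccc": "h2",
    "http://example.com/item?id=2&sid=ddd": "h2",
}
RULE = infer_rule(PAGES)
```

Here the session parameter is found to be irrelevant, so a crawler applying the rule would fetch each item page once. With so few pages the inferred rule depends heavily on which URLs happened to be crawled, so a real crawler would want far more evidence before activating it.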


Google has been adding cities to its transit service, provides information about traffic congestion in some areas, and has supplied driving directions for quite some time. Can they expand their services to help us hail a taxi, find out how close the FedEx truck is when delivering a package, and let us know where the cable guy we are waiting for is?

User location driven identification of service vehicles
Invented by Mark Crady, Michael J. Chu and Russell Y. Shoji
US Patent Application 20060217885
Published September 28, 2006
Filed: March 24, 2005


A vehicle position aggregation system receives position information for service vehicles from various fleet management systems, and maintains the current location of the vehicles in a database, including information identifying each vehicle’s associated fleet and related contact information. End users can query the vehicle position aggregation system to obtain information about service vehicles in the vicinity of the user’s input location.
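Here is a small sketch of what such an aggregation system might look like at its simplest. The class and field names, the in-memory dictionary standing in for the database, and the sample fleets and phone numbers are all hypothetical; the filing does not specify any of them.

```python
import math
from dataclasses import dataclass


@dataclass
class Vehicle:
    vehicle_id: str
    fleet: str
    contact: str  # contact information for the vehicle's fleet
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class PositionAggregator:
    """Keeps only the most recent position reported for each vehicle."""

    def __init__(self):
        self._vehicles = {}

    def update(self, vehicle):
        """Record a position report from a fleet management system."""
        self._vehicles[(vehicle.fleet, vehicle.vehicle_id)] = vehicle

    def nearby(self, lat, lon, radius_km):
        """Return vehicles within radius_km of the user, closest first."""
        hits = [
            (haversine_km(lat, lon, v.lat, v.lon), v)
            for v in self._vehicles.values()
        ]
        return [v for d, v in sorted(hits, key=lambda h: h[0]) if d <= radius_km]


aggregator = PositionAggregator()
aggregator.update(Vehicle("cab1", "City Cabs", "555-0100", 37.7750, -122.4195))
aggregator.update(Vehicle("van7", "Bay Movers", "555-0199", 37.8044, -122.2712))
```

A user standing in downtown San Francisco who asks for vehicles within two kilometers would see only the cab, along with its fleet name and contact number.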

There have been a few patent filings from Amazon on recommendation systems. This one looks at similarities between items to make recommendations.

Personalized recommendations of items represented within a database
Invented by Jennifer A. Jacobi, Eric A. Benson, and Gregory D. Linden
Assigned to Amazon
US Patent 7,113,917
Granted September 26, 2006
Filed: May 7, 2001


A computer-implemented service recommends items to a user based on items previously selected by the user, such as items previously purchased, viewed, or placed in an electronic shopping cart by the user. The items may, for example, be products represented within a database of an online merchant. In one embodiment, the service generates the recommendations using a previously generated table that maps items to respective lists of “similar” items. To generate the table, historical data indicative of users’ affinities for particular items is processed periodically to identify correlations between item interests of users (e.g., items A and B are similar because a large portion of those who selected A also selected B). Personal recommendations are generated by accessing the table to identify items similar to those selected by the user. In one embodiment, items are recommended based on the current contents of a user’s shopping cart.
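The similar-items table described in the abstract can be sketched as follows. The scoring formula (co-selection count divided by the square root of the product of each item's popularity) is my own choice for illustration, as are the function names and the sample histories; the abstract does not say how similarity is scored.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_similar_items_table(histories, top_n=3):
    """Periodic offline step: map each item to its most similar items."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in histories.values():
        unique = sorted(set(items))
        for item in unique:
            item_count[item] += 1
        for a, b in combinations(unique, 2):
            pair_count[(a, b)] += 1

    table = defaultdict(list)
    for (a, b), both in pair_count.items():
        # Assumed score: share of co-selections, damped by item popularity.
        score = both / math.sqrt(item_count[a] * item_count[b])
        table[a].append((b, score))
        table[b].append((a, score))
    return {
        item: sorted(sims, key=lambda s: (-s[1], s[0]))[:top_n]
        for item, sims in table.items()
    }


def recommend(selected, table, k=2):
    """Online step: look up items similar to the ones the user selected."""
    scores = defaultdict(float)
    for item in selected:
        for other, score in table.get(item, []):
            if other not in selected:
                scores[other] += score
    ranked = sorted(scores.items(), key=lambda s: (-s[1], s[0]))
    return [item for item, _ in ranked[:k]]


# Illustrative selection histories: user -> items purchased or viewed
HISTORIES = {
    "u1": ["book", "lamp"],
    "u2": ["book", "lamp", "pen"],
    "u3": ["book", "pen"],
    "u4": ["tent", "stove"],
}
TABLE = build_similar_items_table(HISTORIES)
```

Because the table is built ahead of time, producing recommendations for a shopping cart only takes a few lookups, which fits the abstract's split between periodic processing and per-user recommendation.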

My usual reminder about patents: Some of the processes and technology described in patents are created in house, and some are developed with the assistance of contractors and partners. A percentage are never developed in a tangible manner, but may serve as a way to attempt to exclude others from using the technology, or even to possibly mislead competitors into exploring an area that they might not have an interest in (sometimes skepticism is good).

There are times when a Google or Yahoo acquires a company to gain access to the intellectual property of that company, or the intellectual prowess and expertise of that company’s employees. And sometimes patents are just purchased.

Want to comment or discuss? Visit our Search Technology & Relevancy area of the Search Engine Watch Forums.
