The USPTO published two patent applications yesterday, one from Yahoo and the other from Microsoft, that I think are worthy of a quick post and link here.
The Yahoo app looks at ways to automatically identify and extract data from HTML pages, which the application says is useful for building databases from unstructured web content. Very interesting. The MS application looks at a method for electronic yellow pages to include a service merchant that is located outside a specific geographical boundary, but provides service inside it, in the result set of a search directed to a location inside that boundary.
First Filed: May 4, 2005
Title: Systems and methods for identifying and extracting data from HTML pages
Abstract: Systems and methods for analyzing HTML formatted web pages to automatically identify and extract desired information. A computer algorithm identifies and extracts different pieces of information from different web pages automatically after minimal manual setup. The algorithm automatically analyzes pages with different content if they have the same, or similar, formats. The algorithm is fast and efficient and performs the extraction process quickly in real-time. The systems and methods are useful to build databases from unstructured web information. The algorithm can be used as an agent that captures information about products, and compares prices or other characteristics. It can also be used to populate structured databases that, given the different pieces of information, can analyze products and their characteristics. And it can also be used for data mining applications looking for patterns useful for marketing analyses, or other uses.
Note: Udi Manber, now the person in charge at A9, is listed as a co-inventor.
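To make the Yahoo abstract a little more concrete, here is a rough Python sketch of the general idea: a small manual setup step (a mapping from CSS class names to field names) that is then applied to any page sharing the same format. The abstract does not disclose the actual algorithm, so this is not the patent's method. The class-based rules, the `FieldExtractor` class, the `extract` function, and the sample pages are all hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; skipped so the stack stays balanced.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class FieldExtractor(HTMLParser):
    """Collects text found inside elements whose class maps to a field name."""

    def __init__(self, class_to_field):
        super().__init__()
        self.class_to_field = class_to_field  # the "minimal manual setup"
        self.stack = []    # one entry per open element: a field name or None
        self.record = {}   # extracted field name -> text

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        field = next(
            (self.class_to_field[c] for c in classes if c in self.class_to_field),
            None,
        )
        self.stack.append(field)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Attribute the text to the innermost enclosing element that is a field.
        field = next((f for f in reversed(self.stack) if f), None)
        text = data.strip()
        if field and text:
            self.record[field] = (self.record.get(field, "") + " " + text).strip()


def extract(html, class_to_field):
    """Run the same rules over any page that shares the format."""
    parser = FieldExtractor(class_to_field)
    parser.feed(html)
    parser.close()
    return parser.record


# Hypothetical rules and pages: same layout, different content.
rules = {"title": "name", "price": "price"}
page_a = '<div class="item"><h2 class="title">Blue Widget</h2><span class="price">$9.99</span></div>'
page_b = '<div class="item"><h2 class="title">Red Gadget</h2><br><span class="price">$14.50</span></div>'

for page in (page_a, page_b):
    print(extract(page, rules))
```

The rules are written once and reused across both pages, which is the "same, or similar, formats" point from the abstract. A real system would need to cope with much messier markup than this toy handles.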
First Filed: August 12, 2005
Title: Method and system for providing service listings in electronic yellow pages
Abstract: A method and system for allowing a regional service merchant that is outside of a given geographical boundary, but services inside the geographical boundary, to be included in a result set of a search directed to a location inside the geographical boundary. Text and/or glyphs are returned along with the regional service merchant's business listing so as to explain to a user why a business not physically residing in the search area has been included in the result set. An application programming interface ensures that, if a business is listed as a regional service merchant, then the text and/or glyph is stored in association with the business listing.
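Here is a rough Python sketch of how the Microsoft abstract might play out in practice, under my own assumptions: areas are represented as ZIP code strings, and the explanatory "text and/or glyph" is a plain note string. The `Listing` class, its fields, and the `search` function are hypothetical names for illustration and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    name: str
    home_area: str                          # where the business is physically located
    service_areas: frozenset = frozenset()  # other areas it services, if any
    regional_note: str = ""                 # explanation shown to the searcher

    def __post_init__(self):
        # Mirrors the abstract's API rule: a regional service merchant
        # must always have its explanatory text stored with the listing.
        if self.service_areas and not self.regional_note:
            raise ValueError(f"{self.name}: regional listing needs a note")


def search(listings, area):
    """Return (name, note) pairs for businesses located in or servicing `area`."""
    results = []
    for listing in listings:
        if listing.home_area == area:
            results.append((listing.name, ""))
        elif area in listing.service_areas:
            results.append((listing.name, listing.regional_note))
    return results


# Hypothetical sample data.
listings = [
    Listing("Main St Bakery", "98052"),
    Listing("Ace Plumbing", "98004", frozenset({"98052"}), "Serves this area"),
    Listing("Far Florist", "98101"),
]

for name, note in search(listings, "98052"):
    print(name, f"({note})" if note else "")
```

The plumber sits outside the searched area but still shows up, along with a note explaining why, while the florist that neither resides in nor services the area is left out.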