Bill Slawski is reporting that Yahoo is joining the ranks of MSN and Google in an attempt to incorporate page layout into its web search algorithm. A new patent filed by the search engine examines how to estimate page elements without rendering the web page the way a browser does. As a result, the process for indexing a page could become faster.
The process involves creating object trees based on structural elements contained within the HTML code of a given web page. The goal is to give more weight to the unique content of a page versus the site-wide static content.
In other words, Yahoo wants to pay less attention to sidebars, headers, footers and other elements that appear on every page of a site, and focus on the content that is unique to a single page. Links and content within that unique section would then be given more weight than those in the static elements.
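To make the idea concrete, here is a minimal Python sketch of that kind of segmentation. It is not the method in Yahoo's patent filing: the list of structural tags, the `weigh_blocks` helper and the 1/n weighting are all assumptions chosen for illustration. It builds a simple tree from block-level tags without rendering the page, then gives less weight to text blocks that repeat across several pages of a site.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumed set of "structural" tags; the patent's own list is not given here.
BLOCK_TAGS = {"div", "table", "td", "ul", "p", "header", "footer", "nav"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = []


class TreeBuilder(HTMLParser):
    """Builds a tree of structural elements without rendering the page."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            node = Node(tag, self.current)
            self.current.children.append(node)
            self.current = node

    def handle_endtag(self, tag):
        # Find the nearest open node with this tag; ignore unmatched tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.text.append(data.strip())


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def blocks(node):
    """Yield (tag, text) for every node that directly holds text."""
    if node.text:
        yield node.tag, " ".join(node.text)
    for child in node.children:
        yield from blocks(child)


def weigh_blocks(pages):
    """For each page, return (text, weight) pairs.

    Illustrative weighting only: a block seen on n pages gets weight 1/n,
    so site-wide navigation and footers score lower than unique content.
    """
    trees = [build_tree(page) for page in pages]
    seen_on = Counter()
    for tree in trees:
        for block in set(blocks(tree)):
            seen_on[block] += 1
    return [
        [(text, 1.0 / seen_on[(tag, text)]) for tag, text in blocks(tree)]
        for tree in trees
    ]
```

Fed two pages that share a navigation list and a footer but have different body paragraphs, this sketch scores the shared blocks at 0.5 and each unique paragraph at 1.0. A real search engine would presumably use far richer signals than exact text matches.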
Slawski concludes that if you develop your own sites, looking at the patent in depth may be worth your while:
If you build web pages, and you want an idea of how a search engine might be looking at and weighing the content of your pages, you may want to spend some time with this patent filing.
Considering that Google and Microsoft also have developed methods to segment the contents of web pages, it’s not a bad idea to get a sense of how they all might be breaking pages down into parts.