Figuring out what works and what doesn't for a search engine can take time, and lots of trial and error.
Sure, you can read what the experts and self-proclaimed experts say and take their word as gospel, but that's a dangerous game. I've seen several posts from some "leading lights" in the search community over the last few months that have had me sit up and say "huh?"
In the past I've also dealt with in-house experts who had experienced one data point over a year ago, and from that they extrapolated a full theory on what search engines like and don't like.
Presumably, these people don't do it maliciously, but such misinformation can prevent you from reaching your goals.
So where should you get your information from? How about directly from the search engines?
That doesn't just mean subscribing to the search engine blogs, reading the webmaster guidelines, or even poring over what was and wasn't said by search engine reps at the latest SES. Instead, the tools they provide give you more and more information about what the search engines like and don't like.
Google Webmaster Tools
Most Search Engine Watch readers should be intimately familiar with Google Webmaster Tools, but have you noticed one of the latest additions?
Last week Google added News XML Sitemap error feedback to the tool. So if you want to know why a particular article isn't showing up in Google News, all you have to do is pop over to Webmaster Tools, where you might find that the article was deemed to be fragmented or too short. From there you can adjust your production process so that your articles are more likely to appear in Google News.
IIS Search Engine Optimization Toolkit
It's not just Google, though. Microsoft, the company behind Bing, also has a valuable tool, the IIS SEO Toolkit, which provides a nice amount of information about what they like and don't like.
A couple of quick disclaimers before you rush off to install it. This tool requires Vista or Windows 7 on your machine, the installation process isn't particularly easy, and if your site has major architectural issues this tool may fail to run correctly.
Once you do have this tool, type "inetmgr" from a command line (which I couldn't find in the documentation, but found on a helpful forum). Click on the Search Engine Optimization Site Analysis button, click on "New Analysis," enter your starting URL, and enter the maximum number of pages to spider. You can go up to 999,999, but running it for 10,000 pages will typically give you a good feel for a site. Once it's finished running, you can view the reports.
The reports show all of the errors encountered when crawling the site. More likely than not, you'll see a lot more than 10,000 errors for your pages.
What's interesting when you click through to look at each of the individual reports is that they detail some information about why they make a recommendation or regard something as an error. For example, if your site has an old page that redirects to another page that redirects to another page, they let you know explicitly that link equity won't make it from the oldest page to the newest.
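If you want to spot those redirect chains yourself before running a full crawl, here is a minimal Python sketch. It is not how the IIS SEO Toolkit works internally. It assumes you have already exported your redirects into a simple mapping of old URL to target URL, and the function names and the max_hops limit are hypothetical, chosen only for illustration.

```python
def redirect_chain(start_url, redirects, max_hops=10):
    """Follow redirects from start_url and return every URL visited, in order.

    `redirects` is an assumed mapping of source URL -> target URL.
    Stops at the first URL with no redirect, on a loop, or after max_hops hops.
    """
    chain = [start_url]
    seen = {start_url}
    current = start_url
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        chain.append(current)
        if current in seen:
            break  # redirect loop: stop instead of spinning forever
        seen.add(current)
    return chain


def find_multi_hop_redirects(redirects):
    """Return only the chains that take more than one hop to resolve."""
    chains = {}
    for url in redirects:
        chain = redirect_chain(url, redirects)
        if len(chain) > 2:  # start URL plus two or more further URLs
            chains[url] = chain
    return chains


if __name__ == "__main__":
    # Hypothetical sample data: /old -> /newer -> /newest is a two-hop chain.
    sample = {"/old": "/newer", "/newer": "/newest", "/promo": "/sale"}
    for source, chain in find_multi_hop_redirects(sample).items():
        print(source, "resolves via", " -> ".join(chain))
```

Any URL this flags is a candidate for pointing the oldest page straight at the final destination, so the link passes through a single redirect.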
Because the search engines provide these tools, replete with such information, it only makes sense to use them and listen to what they tell you. Just remember that search engines constantly evolve, so what they tell you today may or may not apply tomorrow.