Search Engine Visibility and Site Crawlability, Part 1

Huge databases that generate Web site content on the fly can be the bane of search engine spiders' existence. They can't find pages; they can't see URLs. So they can't index pages.

At the recent SES Chicago conference, Laura Thieme and Matt Bailey, both experienced presenters, showed how webmasters can tame the dynamic Web site beast and improve search engine visibility and site crawlability.

This is the first of two articles outlining the SEO problems they identified with dynamic Web sites, along with solutions to those problems. I’ll expand on both.

At the very highest level, dealing with large dynamic Web sites requires IT and marketing to collaborate. The person responsible for technical search engine optimization is an important member of the team. Without the cooperation of technical SEO support staff, your Web site can end up poorly optimized for search engines.

Keyword Research and Deployment

Do your keyword research up front. You need to understand the language potential customers use to find the products or services you offer. You also want to understand related topics that interest these potential customers.

You then want to match this understanding of the marketplace with the content you already have, or are willing to develop, and design a logical, clean site hierarchy around it.

Laura said embedding keywords into your page names (i.e., the URI) is a powerful SEO tactic. In fact, she recommends that existing sites embed the most important keywords for each page into its URI (and then 301-redirect the old page location to the new one).
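To make the redirect half of that concrete, here is a minimal sketch assuming a Python/Flask-based site; the URLs and the keyword slug are hypothetical examples, not anything from the presentation.

# A minimal sketch, assuming a Flask-based site. The old dynamic URL
# and the new keyword-rich URL are hypothetical placeholders.
from flask import Flask, redirect

app = Flask(__name__)

# New keyword-rich location for the page.
@app.route("/widgets/blue-ceramic-garden-widgets")
def blue_ceramic_widgets():
    return "Product page content goes here."

# Old dynamic location (e.g. /product?id=123 on the old site); send
# visitors and crawlers to the new URL with a permanent 301 redirect.
@app.route("/product")
def old_product_page():
    return redirect("/widgets/blue-ceramic-garden-widgets", code=301)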

Don’t use the same titles and headers for all the pages of your site. That’s another reason to do keyword research: the long tail of keywords enables you to create a different title and header for every page of the site.

Large sites also have issues with pages being found and indexed, and unique, descriptive titles and headers help search engines treat each page as distinct content. Make sure the most important keywords for your site appear in the title and header of the home page.
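As a rough illustration (the keyword phrases and page names here are made up), keyword research can drive a simple mapping that gives every page its own title and header instead of one boilerplate string:

# A minimal sketch with made-up keyword data: map each page to its own
# researched keyword phrase and build a unique title and header from it.
page_keywords = {
    "home": "custom garden widgets",
    "blue-ceramic": "blue ceramic garden widgets",
    "care-guide": "how to clean ceramic garden widgets",
}

def title_and_header(page_id, site_name="Example Widget Co."):
    # Build a unique <title> and <h1> from the page's researched phrase.
    phrase = page_keywords[page_id]
    return f"{phrase.title()} | {site_name}", phrase.capitalize()

print(title_and_header("blue-ceramic"))
# ('Blue Ceramic Garden Widgets | Example Widget Co.', 'Blue ceramic garden widgets')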

Robots.txt Files

Matt pointed out that it’s surprising how many sites have problems with their robots.txt file. He provided a tutorial on how robots.txt files work. You can also find good tutorials on the Web.

I’ve seen many sites that have problems with robots.txt. While it’s a powerful tool for directing the way search engine crawlers see your site, it’s easy to make a mistake, and a single mistake can have catastrophic consequences. So do use robots.txt, but use the file with great care.
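For comparison, a more typical use of the file blocks only the specific directories you don’t want crawled; the paths below are hypothetical examples:

User-agent: *
Disallow: /cgi-bin/
Disallow: /checkout/
Disallow: /internal-search/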

To illustrate how easy it is to make a mistake: a while back we worked with a client that implemented new versions of its site on a staging server separate from the one used for the live site. This allowed them to review the new site in a live environment before launching it.

They didn’t want this duplicate site indexed by the search engines, so they implemented a robots.txt as follows:

User-agent: *
Disallow: /

Experienced readers are already cringing because they know where this is going. Unfortunately, one day during a site update they copied the whole site from the staging server to the live server, including the robots.txt file.

It took three weeks before anyone noticed, and even then only because traffic to the site was already crashing. Ouch. Again, use it, but be careful.
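One way to catch that kind of accident quickly (my suggestion, not something covered in the session) is an automated post-deployment check using Python’s standard-library robots.txt parser; the domain and page below are placeholders for your own key URLs:

# A minimal sketch: fetch the live robots.txt after each deployment and
# confirm an important page is still crawlable by a major search engine bot.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.set_url("http://www.example.com/robots.txt")
parser.read()

if not parser.can_fetch("Googlebot", "http://www.example.com/products/"):
    print("WARNING: robots.txt is blocking a key page from Googlebot!")
else:
    print("robots.txt allows crawling of the key page.")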

Site Maps

The Sitemap is an XML file that lists URLs for a site, along with when each URL was last updated, how often it usually changes, and its importance relative to other URLs on the site. The protocol was designed to enable search engines to crawl the site more intelligently. In principle, this should help the search engine find all of the pages on your site more easily than it would by crawling naturally.
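For reference, a minimal Sitemap file looks like the following; the URL and values are placeholders, and the tags come from the sitemaps.org protocol:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/products/blue-widgets</loc>
    <lastmod>2006-12-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>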

Matt had a great point when he told the audience you’re better off letting your site be found naturally by the crawler. He referred to Sitemaps as a “fast track to the supplemental index.”

I couldn’t agree more. We don’t use Sitemaps on our clients’ sites.

For the crawlability problem (lots of pages not being indexed), the following two steps are the best solution:

  1. Build an efficient and clean global navigation system on your site. There is no better way to help a crawler find the pages of your site than a logical and simple navigation structure. If you do this, the crawler will find its way around.
  2. Get third party sites to link to your site. No on-site search engine optimization strategy will work unless you get these links.

Summary

We’ve outlined three key problem areas with sites that have dynamically generated content: information architecture and keyword research; robots.txt files; and the use of Sitemaps.

When search engines don’t index pages on your site, invest your time and energy in site navigation and inbound link development, rather than in the Sitemaps protocol.

In Part 2, we’ll look at tools for analyzing your Web site structure and traffic, as well as more solutions to SEO problems with dynamically generated sites.

Eric Enge is the president of Stone Temple Consulting, an SEO consultancy outside of Boston. Eric is also co-founder of Moving Traffic Inc., the publisher of City Town Info and Custom Search Guide.

