Coping With Listing Problems At Google

Several weeks ago, I was talking with Greg Boser, one of the moderators of a Google forum. Greg commented about how people no longer seemed to perceive that there are other crawler-based search engines out there beyond Google. Questions would be posted to the Google forum that were really applicable to crawlers in general, yet those asking seemed concerned only with what Google thought.

It's an understandable reaction. Google's continued rise in popularity has translated into increased traffic for sites that are listed well in the search engine. Indeed, Google is arguably the single most important search engine on the web today, in terms of potential traffic. It's no wonder that many web site owners have a Googlecentric view of the search engine world.

Unfortunately, Google has a major problem with its popularity. Web site owners with difficulties getting listed in its editorial results have no guaranteed channel to get their problems investigated. Did your site disappear from Google? Wondering if a move to a new server has caused you problems? You can send feedback directly to Google, post to a forum -- heck, even message me -- but there's no guarantee that Google will answer you.

In contrast, all the other crawlers do have at least one mechanism you can use: their paid inclusion programs. Can't figure out why your URL has been dropped? Well, if you paid to get that URL listed with Inktomi, AltaVista, FAST or Ask Jeeves, then at least paid inclusion gives you an official route to get an answer. These services have an incentive to give you an answer, because you've paid money for your inclusion. You aren't just getting a "free" ride; you are a customer.

I'm not saying that Google needs to add a paid inclusion program. I'm merely pointing out that it lacks any guaranteed channel for site owners to get feedback. As a result, site owners with concerns may not get official answers. Moreover, the concerns seem to be rising, based on feedback from my readers and posts in various forums. The concerns also eerily hearken back to a situation with the former champ of search, Yahoo.

Deja Yahoo

From 1995 to 1998, Yahoo was the king or queen of search engine related traffic. Just as with Google, site owners were obsessed with it, because of the sheer amount of traffic a Yahoo listing could bring. And just as with Google, there were complaints. Indeed, the inability to get a site listed with Yahoo was the complaint I heard more than any other when it came to search engine marketing.

You don't hear such complaints anymore about Yahoo. It's not that Yahoo has lost its importance. Yahoo still delivers tons of traffic and is very much a "must" on the submit list. However, in 1999 Yahoo finally opened an "Express" submission service, where site owners could pay for expedited review. This was the single most important factor in quelling listing complaints.

Google is very much in Yahoo's position today. Paid inclusion may not be the answer to solve legitimate webmaster concerns, but neither will be simply telling site owners to consider advertising on Google if they need guarantees. Without some type of mechanism, the warm, cuddly image Google currently enjoys will be tarnished in the same way that warm, cuddly Yahoo found its image tarnished, in the wake of webmaster complaints.

What are those complaints? Let's examine several common concerns below, to see what may be underlying causes for them.

Missing Pages

Have many or all of your pages disappeared from Google? This might be for a variety of reasons, some innocent and some not. First of all, you need to ensure you understand the difference between not ranking well and not being listed at all. They are not the same.

Let's say you were ranking well for a particular term, such as "cars." You've been in the top results for this term for several weeks, but then one day you discover you've been bumped out. Indeed, you go through several pages of results and can't find any pages from your site listed. Has Google banned your site and delisted you? Not necessarily.

Rankings can change all the time. Google is always adding new pages to its index of the web. Anytime new pages come in, the mix of what ranks well for a particular term is subject to change. A massive update of new pages still also happens on a roughly monthly basis, and that is even more likely to produce ranking changes. However, a change in rank doesn't mean your pages aren't in Google at all.

Before you panic, run a "site" search in Google. This is explained on the "Checking Your URL" page, below. It will bring up all the pages (or nearly all) from your web site. You are likely to discover that Google does list many of your pages. In that case, you haven't been banned from Google. You are listed, but your drop in rank may be due to "natural" changes to the index, rather than an overt penalty by Google.
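For instance, using a hypothetical domain (substitute your own), the check is simply a search for:

```
site:www.example.com
```

If pages come back, your site is in the index, whatever its current rankings. Note that some versions of Google may expect an accompanying search term alongside the operator; see the "Checking Your URL" page, below, for the exact details.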

Low PageRank

Still suspect that Google has it out for you? You might decide to confirm this by loading up a copy of the Google Toolbar. Once installed, the "PageRank" meter will show you, on a scale of 0 to 10, how "popular" the page you are viewing is deemed to be, based on Google's link analysis calculations.

Armed with the toolbar, you head over to your site and discover to your horror that you have a low PageRank, maybe just a 2 (hover your mouse over the toolbar display, and you'll see an exact number). Or even worse, you might get the dreaded "gray" toolbar, showing that you aren't ranked by Google at all. Surely this is a sign that Google dislikes you.

Perhaps not. The PageRank shown by the toolbar is only an estimate of how Google actually rates the page in its index. The toolbar may even report a PageRank for a page Google doesn't even list, making an "estimate" based on the page's host domain. Also, network problems could cause you to get no PageRank score for a page one day, only to find this gets corrected the next.

Matt Cutts is a software engineer at Google who deals with submission issues, and it is worth remembering what he said about the toolbar when I covered similar issues with it earlier this year:

"I wouldn't rely on the toolbar as the on-high gospel of what that PageRank is," Cutts said.

It's also worth recognizing that PageRank is not the end-all-be-all to doing well at Google. Plenty of actors are great despite never having won Oscars. Similarly, plenty of pages may rank well even though they don't have starring PageRank scores.

For example, a search on "viagra" brings up plenty of middling 5 out of 10 pages in the top ten results. And going into the second page of results, you can begin to find 6/10 sites showing up. If PageRank, as measured by the toolbar, were all-important, then these PageRank 6 sites ought to come before the PR5 sites. They don't, and that's because PageRank is only one of many important components that Google uses when ranking pages. (By the way, avoid site number 11 on that search. Let's just say you don't need to read Russian to understand what's being pitched, and it's not Viagra).

Similarly, try a search for "used cars." Here, a PR3 site about used cars in Chicago makes it into spot number 9, while a Canadian used car site that's PR7 winds up in spot number 10. How about "palm 515?" Hey, it's the Russians again! But this time, different Russians, with a page actually related to the search term. They show up in the first spot with a PR4 page. Second place goes to a PR3 site; fourth to a PR2 site. Spotting a downward trend? No, you aren't -- site number five is a PR5 site.

So, if you've got a low PageRank score, don't automatically assume that Google is overtly penalizing you or that you'll never get any traffic. You might still do well for a variety of terms. Certainly, continuing to do appropriate link building activities can be helpful. Articles on such link building are listed below.

Houston, We've Got A Problem: No Reverse Links Shown

Currently, there is only one fairly easy "official" way to determine whether your site is in the Google "penalty box," as Cutts has put it in the past. Run a reverse link lookup (see the Measuring Link Popularity article below) for your site's home page. If you find nothing is listed as linking to it -- and you know that at least some major links do lead to it -- then this is usually a sign that Google is punishing you.
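As a hypothetical illustration, again using example.com as a stand-in for your own domain, the reverse link lookup is a search for:

```
link:www.example.com
```

If that search returns nothing, despite your knowing that major links do point at the page, the penalty box becomes a real possibility.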

Cutts reconfirmed that this is still the case when I spoke with him recently. However, he also stressed that the Google penalty box isn't overflowing with bad players.

"The number of people who are actively punished is really small," Cutts said.

What if you are one of those people? Then get in contact with Google via its spam reporting form, below. Yes, this is a page for reporting spam, but it is also a page you can use to ask if you've been blacklisted and plead that you've reformed (assuming you know exactly what you did that was wrong). There's no guarantee you'll get an answer, but it is the place to start.

Don't Assume You've Done Bad

Cutts also stresses that the vast majority of people who assume they've been punished by Google have not. They will see pages go missing and guess they've been bad, when in reality, there are technical problems.

For instance, last April, Google sped up how quickly it was crawling web pages in an effort to increase freshness. This also increased the number of errors it had when revisiting pages, causing some to be dropped. It was a Google technical problem, not a spam penalty, that made the pages disappear for a short period of time.

"All these people had assumed they had been punished," Cutts said.

In other cases, people have inadvertently kept Google out of their sites. Badly formed robots.txt files are a chief cause of this. Robots.txt files should be used to block spiders from reaching pages that you DO NOT WANT indexed. If you want your entire site to be indexed, then you don't need to have a robots.txt file at all, and not having one reduces the chances of errors.

Some people try to be tricky with the file. They attempt to restrict certain portions of their web sites, in hopes that the "remaining" portions will be crawled more deeply. Unfortunately, if you don't get the syntax correct, you could block parts of your site that you do want indexed -- or all of it. This is a common problem Cutts has seen.
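As a sketch of how the syntax can go wrong, the snippet below uses Python's standard urllib.robotparser to compare a file that blocks only a /private/ directory against one where a stray "Disallow: /" shuts out the whole site. The domain and paths are hypothetical, purely for illustration:

```python
from urllib.robotparser import RobotFileParser

def allows(robots_lines, url, agent="Googlebot"):
    """Return True if the given robots.txt lines permit the agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)

# Intended rule: keep crawlers out of /private/ only.
good = ["User-agent: *", "Disallow: /private/"]

# Common mistake: a bare "/" disallows the entire site.
bad = ["User-agent: *", "Disallow: /"]

page = "http://www.example.com/products/index.html"
print(allows(good, page))  # True -- the page is still crawlable
print(allows(bad, page))   # False -- everything is blocked
```

One character of difference in the Disallow line is the difference between shielding a single directory and vanishing from every crawler that honors the file.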

Virtual hosting is the source of another problem for many sites. With virtual hosting, several web sites all live on the same web server, perhaps even using the same IP address. Done properly, Google (and other crawlers) have no problems with this. However, Cutts said that some ISPs may not configure virtual servers correctly, causing Google (and probably other crawlers) problems in reaching sites. This is an issue I expect to return to in-depth in a future newsletter.

Getting Further Reassurance

Clearly, mistakes can happen. These can be on the part of the site owner, their hosting provider or with Google itself. These mistakes can also cause panic for site owners, who then send feedback to Google, which in turn can't answer everything given the sheer volume of information coming in. What's the solution?

I've suggested in the past to Google that one option might be a page validator. Enter your URL or domain name, and Google would automatically report to you any common problems that it found. You could quickly learn if there was a robots.txt block, if the virtual server had a glitch or perhaps even whether the page has indeed been blocked for spamming.

The worry from Google about this type of tool is that it would somehow get abused, enabling site owners to better manipulate Google's editorial rankings. There's a paranoia within parts of the company that providing anything to site owners relating to Google's editorial results is enabling a subset of those site owners to weaken the quality of Google's results.

This paranoia is also a big part of why, well over a year after Google got tougher on rank checking tools, it has still failed to come up with its own internal ranking solution or licensing program. Rank checking is seen as simply helping those who want to manipulate Google.

Of course, Google has good reason to be paranoid. It has built its reputation on having quality search results, and that reputation has further been enhanced by the perceived "purity" of not having any paid products with a connection to those editorial results. That reputation is worth protecting.

Nevertheless, even some at Google recognize that some further help for webmasters might be necessary. The company made a huge stride forward last November when it provided greater help information. Now that listing concerns continue to grow, could some programs be developed to solve them?

"The sheer volume is really hard," Cutts said, about the enquiries from webmasters worried about their listings. "The people who know [at Google, about the volume] think we need to have a webmaster services, but that would really be far off."

What could "webmaster services" entail? Cutts didn't define any specific services, and it should be stressed he didn't even suggest that there are any plans for these services in development. The usual refrain of "Google is always thinking about things" should be kept in mind, here. Most especially, the words "paid inclusion" were not said. Rather, webmaster services are more an idea among some at Google that perhaps a more organized, official and guaranteed way of having problems investigated could be developed.

Moreover, unlike Google's advertising program, it might be something done not to earn money but instead to cover costs. "More a safety net, rather than a profit center," Cutts said.

I'd certainly encourage such a development. I think most site owners would. Not all site owners are search engine spammers. Indeed, only a tiny few really are, as the search engines themselves have repeatedly said. There needs to be a better outlet to help solve the concerns of this majority, and as with the case of Yahoo a few years ago, many of them would gladly pay a small fee to get reassurance about their listings.

Such a paid safety net program might never happen -- or at the very least, might be a long way off. In the meantime, what's more likely to happen is further education from Google itself. Cutts said the company might post a checklist or flow chart of possible problems and solutions, to better guide site owners seeking answers to their problems.

How long does pr0 last?, June 6, 2002

Ten pages of posts in just this one thread amply illustrate some of the concerns site owners have over being listed with Google. In particular, there is lots of discussion over worries about low or no PageRank reporting in the Google Toolbar. See the fourth page for a long comment from "GoogleGuy," who is indeed a Google staff member who monitors the forums to provide education. It covers some of the technical problems that Google encounters that can cause webmasters to panic that they've been blacklisted, when they haven't.

Google Toolbar

Toolbars, get your Google Toolbars here.

Google: Report Spam Form

Think you spammed Google? Know you spammed Google? Ask or repent here.

Google: Search Contact Addresses

Wondering if there's something wrong with how Google is interacting with your site, but don't think it has anything to do with spam? That's probably the case for most people. Try the fourth "webmaster" address listed here -- I'll also reconfirm whether this is the best one.

Checking Your URL

Measuring Link Popularity

More About Link Analysis

Explains how search engines make use of links from across the web to find pages and rank them in relation to searches. Includes many tips on how to locate "important" sites and request links that can help you with search engine positioning efforts.

Link Issues And Google
The Search Engine Update, Feb. 4, 2002

Last article covering issues with the Google Toolbar and reverse link lookups, as they relate to spamming.

Blackballed from Google
High Rankings' Advisor, July 3, 2002

Jill Whalen investigates a "have I been banned from Google?" complaint and finds it's a robots.txt file misconfiguration that's to blame -- and one that impacts all search engines, not just Google.

How To Block Search Engines

A beginner's guide to the robots.txt file.

The Web Robots Pages: The Robots Exclusion Protocol

The official word on using a robots.txt file.

Craig Silverstein answers your Google questions
Slashdot, July 3, 2002

Craig Silverstein, director of technology at Google, answers top tech questions about Google from Slashdot readers. One answer touches on virtual domains.

Search Engine Watch's Yahoo Special Report
The Search Engine Report, Sept. 3, 1997

Can't remember the bad old days at Yahoo? Take a stroll down memory lane and see this summary of experiences from over 150 people who had submitted to Yahoo.

Life After Yahoo Discussion
The Search Engine Report, March 31, 1998

A year after the survey above, site owners were still having Yahoo problems. "It would not be asking too much of Yahoo to use some of its rich resources to organize itself so as to make it possible for every web site to be listed in the appropriate category," one wrote.

Yahoo Opens Express Submission Service
The Search Engine Report, March 3, 1999

And then Yahoo finally relented, with the first paid participation program of any major search engine, aside from paid placements that launched in 1998 at Overture (GoTo).