On a roughly monthly basis, Google updates its index of web pages. This means that pages which no longer exist may be dropped, while new pages that have been found may be added. The index update is also a time when Google may introduce new tweaks and changes to how its ranking algorithm works.
For the average searcher, this changeover period goes largely unnoticed. Google generally acts the same way it always does, despite the fact that under the hood, its old catalog of web pages is tossed out and replaced with newer information.
In contrast, some search engine optimizers watch each index refresh intimately, trying to determine if it heralds a potential rise or fall in their fortunes. For them, the changeover period that's come to be known as the "Google Dance" may reveal what seem like dramatic changes in Google.
For example, someone who has a page in the top results that's accidentally dropped from the index (as can happen with all search engines) could see a significant loss of traffic until it gets restored during next month's refresh. But for a typical searcher, the loss of that particular page might not even be noticed, assuming there are 10 other good pages in the top results for their query.
Given the stress the Google Dance may cause some, I'd like to declare a new illness: Google Dance Syndrome. To suffer, you need to be a close watcher of the Google Dance who has been hurt by changes in the latest index.
(Fair credit notice -- after writing this story offline while traveling, I thought I'd better check to see if anyone had already suggested Google Dance Syndrome as a term before publishing. Not exactly, in terms of rank suffering. But as a physical ailment, a similar Google Syndrome was coined by WebmasterWorld member NGene in March to describe what happens to those who worry about Google so much. And on an adult webmaster board, there was a reference to Pre-mature Google Dance Syndrome for apparently worrying about when the Google Dance was about to begin, also in March. I don't want to link directly to that adult board, but the curious can follow this Google search to reach it.)
Watching For Signs Of GDS
It's important to understand that GDS is neither communicable nor necessarily indicative of the health of Google itself. Just because an individual suffers from GDS does not mean that everyone is suffering from it. In fact, the Google Dance happens each month without many suffering GDS at all.
"Back in January, we changed some scoring stuff. It was subtle enough that most people didn't notice it at all," said Matt Cutts, a Google software engineer who deals with webmaster issues. "That's a nice thing, when you get an easy win and people don't notice yet the quality improves."
Even those who closely watch for symptoms don't often find things. Anytime the Google Dance happens, some search engine optimizers will begin posting at the popular WebmasterWorld.com forum, trying to determine if there's been a widespread outbreak of GDS. Most months, there are relatively few reported cases.
Last month was different. After the latest Google Dance in mid-May, many people seemed to be suffering GDS. Some noted that Google wasn't finding new links pointing at their web sites. Others complained that sites with spam seemed unharmed. Overall, reports grew so much that a special thread had to be started just to summarize all the issues being raised about the latest index.
Providing medical aid to these GDS sufferers was the ever diligent GoogleGuy, a Google employee who regularly responds to comments at the forum. But not all were comforted by GoogleGuy's reassurances that things would improve. Indeed, his comments that better spam filtering and fresher link analysis data would be coming gave some the impression that the current Google index was not very good.
"Why is Google putting this self-described incomplete index out to the public in the first place?" one person asked in the Google Wobbles thread.
Could it be that Google had made bad tweaks in the latest index, changes that hurt not just some isolated site owners but instead webmasters and searchers across the board? If so, then the current rise in GDS could indicate a serious health problem with Google itself.
Hard To Measure Google's Health
Determining Google's health based on reports of GDS is tricky business. While more may be in the hospital than normal, some of the changes Google makes may have been designed to do exactly that. New spam filters are always being tested, as are changes to the ranking algorithm. Such alterations are designed to benefit searchers with better results, even if the side-effect is that few-some-many (depending on your point of view) webmasters feel pained.
For example, last September we had another major GDS epidemic that made news on WebmasterWorld.com, other search engine forums and even emerged into a Wired news article, after many claimed that their GDS was a sign of decreased relevancy overall on Google.
It sounded terrible, but I wasn't flooded with complaints from searchers at the time confirming that Google had gotten worse (for more, also see the Great Google Algorithm Shift article for Search Engine Watch members). In fact, I got no complaints at all. Similarly, there's been a distinct lack of complaints this time. Moreover, both times, I also didn't hear from more than one or two webmasters concerned about changes.
My takeaway from this is that Google is probably still working pretty well for most searchers and webmasters alike. I think anyone who has highly optimized their web site may be prone to continued bouts of GDS (and similarly hurt at other search engines, when they make changes). But those who've focused on good, solid content? They should (and do) ride through Google Dances without even noticing that they are happening.
Looking Anew At Links
As said, you need to be careful about interpreting GDS reports as being indicative of a problem with Google itself. But what about the twist after the latest dance, where GoogleGuy's own posts seem to acknowledge that the latest index was not released with the most current link analysis information or some spam filtering?
The answer, at least on the link side of things, is that Google is preparing new changes in how it leverages links as part of its overall algorithm.
"We definitely are looking at the next iteration of algorithm improvements. I think that we're in fine shape now, but I think looking toward the future that there still are some easy wins we've identified with link analysis that we're going to go ahead and push into production," said Cutts.
The improved system of analyzing links isn't yet finished. Nevertheless, Google did need to refresh its index to weed out stale results and bring in new content. Solution? Use an existing, older "snapshot" of link analysis data for the time being, then bring in improved link analysis data later or as part of a coming index update.
Yikes! That sounds awful, sort of like saying you want to make a sandwich but need to use some bread that's a little stale. It's certainly better to have fresh data. However, it's also true as Cutts explained that lacking the very latest link data may not make much of a difference for many queries, where existing links already provide a great deal of knowledge.
"Every index has to pass a full battery of tests to say this is of sufficient quality," Cutts said, to underscore the fact that in Google's view, the current release is indeed ready for prime time.
What about the issue that Google has an index out there that's missing some spam penalties? Similar to the situation with link analysis, Google has some new spam filtering systems that it is preparing to release. So references to spam filters yet to be applied in the index relate more to integrating scoring from the new systems, once ready, rather than Google simply not having any spam penalties in the index at all, Cutts said.
Cutts readily admits that it's possible to find pages in the current index that use tactics Google does not like, such as hidden text and hidden links. It's hoped that the new filters will help better eliminate this, in the future. However, Cutts added that the presence of such pages doesn't necessarily translate into bad relevancy.
"For a long time, these things have been annoying webmasters rather than users," Cutts said. "Scoring already takes care of this stuff, but we have seen posts like, 'Why isn't Google handling this'."
One email I did receive about the recent change, echoed by comments I've seen elsewhere, asked why a search at Google might bring up different results if rerun just a few seconds later. Here, understanding the dance part of the Google Dance name may be helpful. And to understand, let's flash back first to a different search engine, the AltaVista of old.
Back in the early days of AltaVista, all the pages that the search engine had resided in one big, powerful mainframe-style computer. Eventually, one computer wasn't enough, so the index was spread across four different mainframe computers. That helped with storage but not with query load. As a result, AltaVista made a duplicate copy of its index, a "mirror" which was kept in a different physical location.
As the system got more complicated, there was a greater chance that something could go wrong. For example, if one of AltaVista's four computers went down, then essentially 1/4 of its index was unavailable to any searchers who were unknowingly directed to that mirror. If they were suddenly switched to a different mirror when trying again, they might then hit the entire index and get different results.
Now let's fast-forward to Google today. Google (like other search engines) distributes its index across hundreds of computers with a processing power similar to that used on your desktop. That solves the storage problem. But what about query load? To help, Google has multiple copies of its index in various locations. When you search, you might hit a copy of the index located on the West Coast of the US, the East Coast or perhaps in Europe, to name some examples.
If the mirror you hit has a few of its computers down (which is fairly common), then some pages might not be available for searching. It's not as bad as in the old AltaVista days, because if 10 or 20 computers aren't working, that's a tiny amount compared to the hundreds that still are. Nevertheless, having some computers down at one mirror could cause the results to be slightly different if you get directed to a different mirror on your next search.
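To make the mirror arithmetic concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how AltaVista or Google actually assign pages to machines: the modulo placement, the function names, the page IDs and the figure of 500 machines per mirror are all assumptions (the article says only "hundreds").

```python
from __future__ import annotations


def machine_for(page_id: int, num_machines: int) -> int:
    """Hypothetical placement rule: a page lives on machine page_id % num_machines."""
    return page_id % num_machines


def search(matching_pages: list[int], down: set[int], num_machines: int) -> list[int]:
    """Return only the matching pages whose machine is up at this mirror."""
    return [p for p in matching_pages
            if machine_for(p, num_machines) not in down]


def fraction_unavailable(num_down: int, num_machines: int) -> float:
    """Share of the index that cannot be searched at a mirror."""
    return num_down / num_machines


# AltaVista-style mirror: 4 machines, pages that match some query.
matching = [3, 8, 13, 18, 21]
full = search(matching, down=set(), num_machines=4)   # healthy mirror
degraded = search(matching, down={1}, num_machines=4)  # machine 1 is down

print(full)      # [3, 8, 13, 18, 21]
print(degraded)  # [3, 8, 18]

# One of four machines down hides a quarter of the index.
print(fraction_unavailable(1, 4))     # 0.25
# Assumed figure: 20 of 500 machines down hides a far smaller share.
print(fraction_unavailable(20, 500))  # 0.04
```

The same query returns different pages depending on which mirror answers it, and the gap shrinks as the index is spread over more machines.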
And now to the "dance" part. When Google updates its index, it has to spread the new information across these hundreds of computers in various locations. It generally takes a day or two until the new information is seeded and stable. As a result, some of the results may seem to "dance" around with slight changes, especially to webmasters who monitor positions like hawks.
So if you've done a search, then repeated it and gotten different results, two things are likely. First, you may have hit a different mirror of the index on your repeat search where the copy isn't perfectly in line with the first index. Second, and more likely, you've done a search and seen the Google Dance in action. And to confirm if it's happening, consider visiting the Google Dance Tool.
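The dance itself can be sketched the same way. The toy simulation below assumes three mirrors, round-robin routing of queries and made-up rankings for a single query; none of these details come from Google, and real routing is surely more involved. It shows only that identical queries disagree while some mirrors hold the old index and others the new one.

```python
from itertools import cycle

# Hypothetical rankings for one query under the old and new index.
OLD_INDEX = {"widgets": ["a.example", "b.example", "c.example"]}
NEW_INDEX = {"widgets": ["b.example", "a.example", "d.example"]}


def rollout_state(num_mirrors, updated):
    """The first `updated` mirrors serve the new index, the rest the old one."""
    return [NEW_INDEX if i < updated else OLD_INDEX for i in range(num_mirrors)]


def run_queries(mirrors, query, n):
    """Send n identical queries round-robin across mirrors (assumed routing)."""
    router = cycle(mirrors)
    return [next(router)[query] for _ in range(n)]


def is_dancing(results):
    """True if identical queries came back with more than one distinct ranking."""
    return len({tuple(r) for r in results}) > 1


before = run_queries(rollout_state(3, 0), "widgets", 6)  # no mirror updated
during = run_queries(rollout_state(3, 1), "widgets", 6)  # one of three updated
after = run_queries(rollout_state(3, 3), "widgets", 6)   # all mirrors updated

print(is_dancing(before), is_dancing(during), is_dancing(after))
# False True False
```

Results are stable before and after the update; only during the partial rollout do repeated searches return different rankings.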
Blogs To Stay
By the way, one thing NOT in the cards for future index changes is any plan to pull blog content out of Google's regular search results. Google made a special point of stressing that blogs are staying, during my interview with them last week.
The idea that blogs were to go came out of a Register article last month. The piece suggested that if a "blog" tab was eventually added to Google, blogs themselves would be removed from the main web page index to increase relevancy. As proof of this, the Register said this is what happened to Usenet posts after Google "acquired Usenet groups" from Deja.
First of all, Google didn't acquire Usenet groups -- no one owns Usenet groups, any more than anyone owns the web. Instead, Deja had archives of posts made in those groups. Google acquired those and then began crawling Usenet to add to the archives. As Usenet information had never been part of the web index, there was nothing to "pull."
So if a blog index is created, it's not a given that blog content would be pulled. Indeed, Google has not pulled directory or news listings from the web index even though both types of content can be found via their own tabs.
And will a blog tab really be coming? Eventually, sure. But it's not something in any immediate plans, Google says.
GDS: Canaries Or Chicken Little?
I've written and spoken before that in the search engine coal mine, there are two types of canaries that spot danger: research professionals and webmasters/webmarketers. Both groups intimately study search engine results and notice changes before ordinary data miners -- the average web searcher.
So when webmasters start reporting GDS, their concerns have to be seriously considered -- especially during a large epidemic like the current one. Some of those suffering from GDS do indeed represent changes at Google that may be bad or imperfect for searchers. Even Google knows this. "Is Google perfect? Of course not," wrote GoogleGuy, in one of the many recent threads emerging out of the latest Google Dance.
But despite Google's imperfections, GDS reports do not necessarily mean the sky is falling for Google itself. How to know for certain if it is? Abandonment by searchers is a sure sign, though that's a long-term trend.
In the short term, an outcry by both researchers and SEOs lends more weight to the idea that something is wrong. Otherwise, I'd watch for major and growing GDS outbreaks among SEOs happening for several months in a row before seriously thinking that Google itself has made some type of terrible mistake.