A Bridge Page Too Far?
From The Search Engine Report
Feb. 3, 1998
One of the more popular bits of advice that have been going around in the past few months is the idea of submitting "bridge" pages or "entry" pages to search engines.
These are pages that have been created to do well for particular phrases. They are also known as portal pages, jump pages and by other names. Regardless of the name, they are easy to identify in that they have been designed primarily for search engines, not for human beings.
Bridge pages are not a new technique. Spammers have used them for ages, setting up hundreds of pages to draw in traffic. However, they've now become much more widespread. I knew they'd hit a new benchmark when I recently discovered State Farm Insurance employing them in a highly sophisticated manner.
State Farm is the largest home and auto insurance company in the United States. Most large companies of its stature are loath to do anything related to spamming. But if State Farm is using bridge pages, does that mean they are spamming the search engines? And if a big company can do it, does that mean bridge pages are legitimate?
A survey of the search engines finds that they do not consider bridge pages necessarily to be spam. It all comes down to how exactly those pages are used.
In order to provide guidance, it's important to understand the various ways bridge pages are employed, both technically and in terms of the content they deliver. I can't cover the technical aspects in this article, because it would make the newsletter too long. However, a new page within Search Engine Watch does provide this information. A link to it is below.
In this article, I'll cover how bridge pages are used and abused, to help you avoid trouble if you go this route.
Example In Action: State Farm
State Farm makes a good case study. Technically, State Farm is delivering pages to the search engines based on their IP addresses. This means each search engine sees something tailored for it, while everyone else gets a generic page.
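To make the idea of IP delivery concrete, here is a minimal sketch in Python of how a server might choose a page based on the requesting address. It is an illustration only, not State Farm's actual setup: the address ranges are reserved documentation ranges standing in for real spider addresses, and the file names and the `page_for` function are hypothetical.

```python
import ipaddress

# Hypothetical spider address ranges (reserved documentation ranges).
# The real addresses used by each search engine's spider are not known here.
SPIDER_NETWORKS = {
    "altavista": ipaddress.ip_network("192.0.2.0/24"),
    "hotbot": ipaddress.ip_network("198.51.100.0/24"),
    "infoseek": ipaddress.ip_network("203.0.113.0/24"),
}

# Hypothetical name for the page that ordinary visitors receive.
GENERIC_PAGE = "generic.html"


def page_for(remote_addr: str) -> str:
    """Return the page to serve to the given requesting IP address."""
    addr = ipaddress.ip_address(remote_addr)
    for engine, network in SPIDER_NETWORKS.items():
        if addr in network:
            # A recognized spider gets the page tailored for that engine.
            return f"tailored-{engine}.html"
    # Everyone else gets the generic page.
    return GENERIC_PAGE
```

Under these assumptions, a request from an address inside the assumed Infoseek range would receive the Infoseek-specific page, while a visitor's browser, or a search engine support tech checking from a desk, would receive the generic one.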
State Farm has a domain used just for its search engine submissions: statefarmins.com, as distinct from the statefarm.com domain. About 50 tailored pages were submitted from that domain to both AltaVista and HotBot, and 18 were submitted to Infoseek. Each page was designed for different topics, such as "auto insurance," "boat insurance," and "find an agent for auto insurance." Multiple pages were submitted in hopes of hitting the right combination. Anyone following a submitted page is automatically rerouted to the site's home page.
State Farm wouldn't confirm any of this for "proprietary" reasons, though they were happy for me to speculate. There's nothing really proprietary going on -- a URL search easily reveals the fingerprints of multiple submissions, and testing showed pages were being IP delivered.
State Farm did well for what are no doubt its top terms. They were page one or two ranked for "auto insurance" with Excite, Infoseek, HotBot and Lycos. They were also page one ranked with Infoseek for "life insurance," "home insurance," "car insurance," and "boat insurance." They had page one or two positions for several of these terms with HotBot and Lycos.
Are Bridge Pages Spam?
At this point, you might think, isn't this spamming? Can you submit all these pages, push the ranking mechanism and tailor pages just for the spiders without grief?
The answer was mixed. Generally, the search engines don't have a problem with what State Farm was doing, because people were not being deceived in the end.
Consider someone searching for "auto insurance." They see the State Farm link and click through. There's no doubt the site is relevant to the topic. The user isn't going to complain, and so the search engines don't have a problem.
"They're not having what some would consider a negative impact on our web results," said David Pritchard, HotBot's marketing director. "I don't think anyone would doubt they are relevant."
HotBot's main concern in this situation is when someone submits so many pages that they crowd other relevant results out of the top listings. Others reflected a similar view, and Infoseek's "More Results" clustering system is meant to help ensure that one site does not dominate the top results.
Relevancy Is Key
But what happens when you get away from relevancy? Imagine you are an insurance agent, and you know people are searching the web for information about Monica Lewinsky. So, you create a page optimized for her name, put a misleading description on it and submit it using your sophisticated IP detection system.
You're going to have problems. Why? It will quickly become obvious to visitors that your site has nothing to do with Monica Lewinsky. They'll complain to the search engines, which take a very, very dim view of this type of behavior.
"The main issue for me is whether or not the site is deceiving my users. I have a responsibility to deliver good data to my users. If there's a situation where my spider sees one thing and my user sees something else, I have a problem with that, because it makes me look bad," said Steve Schneider, who oversees AltaVista's spidering process.
It's similar with Excite: "Where we are most concerned is when a company is using it in the wrong way. The consumer walks away thinking, 'Oh, HotBot or Excite had the wrong information.' We have to be concerned with how that reflects on the brand," said Excite search product manager Kris Carpenter.
Now let's muddy the water. What if you create a page of links about Monica Lewinsky and post that as a service within your site? Now you have legitimate content, so the search engines shouldn't complain, right?
Or, imagine you have a big web site that's completely database driven. The search engines can't crawl your site because of how the database implements URLs. To make up for this, you create lots of pages for different topics, all relevant to your site. You shouldn't have a problem, should you?
Those are just two of the many situations that are impossible to assess generally. If your pages are relevant, people may not complain, and the search engines may not bother with them. However, if you go to excess, submitting many pages, your behavior may be seen as suspect.
What's Being Hidden?
I haven't dealt with the issue of using traditional spamming techniques on bridge pages. That's a no-no, all the search engines agree. They do not want you stuffing keywords, using invisible text (depending on the search engine), or doing a number of other things they consider hostile.
Unfortunately (depending on your view), IP delivery makes it harder to spot spam. People who wonder why their own sites don't appear for a particular term are often the best spam police. They investigate top ranked sites and discover any irregularities. IP delivery is a crippling blow to this discovery process.
Imagine you wanted to come up well for auto insurance. You saw that State Farm was doing well, but because of IP delivery, you'd have no way to check their source code and see if they were spamming. The most you could do would be to message the search engine.
The search engine support techs are busy people. They get messages like this all the time. Chances are, they'll look at the page and see exactly what you saw, the "real" page and not the page that was delivered to the spider.
This is exactly what happened with Infoseek, when it examined the State Farm pages in response to this story. At first glance, the pages seemed fine. But when they went back and looked at what the spider actually saw, they unhesitatingly called the pages spam and removed them from the index.
"It looked OK on the surface," said Sue LaChance Porter, Infoseek's Director of Technology Products. "When I actually saw the page we had spidered, you could see that they were spamming."
(NOTE: Infoseek has since said it does not like any form of redirection at all, regardless of whether the page content is spam free.)
The other search engines may have been spammed as well. Some did not check the pages at all, and I suspect that those that did may not have retrieved the actual IP delivered page.
As for State Farm, it readily admits to experimenting with submissions to get the best results, though it said it did not intend to spam anyone.
"Competitively speaking, it just behooves us to do the best we can," said Bob Reiner, Manager of Interactive Marketing. "We're certainly not trying to do spam," he added. "Spamming to us is a bad term. We don't want to get involved in that."
Detecting IP Delivery
As an individual, you can't tell whether someone using IP delivery is getting past the spam filters. But you can look for clues that may help you guide the search engines to follow up and perhaps keep the playing field more level.
Start with Infoseek. Do the page title and description match what's on the page? If not, you can do what spam vigilantes do and resubmit the page. Wait a few minutes (sometimes up to a day), then do a URL search and see what's now listed. Are the page title and description still different? If so, then spider-specific delivery is occurring.
Take a look at some of the other search engines. Are you seeing the same URL being listed, but with a different title on each search engine? That's more evidence of specific delivery. Also do a search for all URLs from that site. This will quickly show you if someone has submitted multiple pages to test the ranking mechanism.
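The title comparisons described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: you have already saved the HTML your own browser receives and noted by hand the title each search engine lists for the same URL. The names `extract_title` and `delivery_clues` are made up for this sketch, and it does not query any search engine.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the page's <title> tag."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    return " ".join(parser.title.split())


def delivery_clues(live_html: str, listed_titles: dict[str, str]) -> list[str]:
    """Compare the title a visitor sees with the titles the engines list.

    listed_titles maps a search engine name to the title it shows for
    the URL, as copied by hand from a URL search on that engine.
    """
    clues = []
    live_title = extract_title(live_html).lower()
    normalized = {
        engine: " ".join(title.split()).lower()
        for engine, title in listed_titles.items()
    }
    for engine, title in normalized.items():
        if title != live_title:
            clues.append(f"{engine}: listed title differs from the live page")
    if len(set(normalized.values())) > 1:
        clues.append("same URL is listed with different titles across engines")
    return clues
```

An empty list means the listings match what visitors see. Any clues returned are only hints worth passing along to the search engines. As noted, spider-specific delivery does not by itself mean that spamming is going on.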
By now, you have a good idea that someone is being very sophisticated. That doesn't mean that they are spamming, but it does help you better advise the search engines. Tell them that you are seeing spider-specific pages being delivered and explain that you want to ensure they check that spamming techniques are not being used.
If you do this, you may help improve the odds that those hiding spam behind IP delivery will be caught. But don't waste the search engines' time. Be certain that this is happening and that it is worth their following up.
Should You Do It?
At this point, you may also be thinking that bridge pages are the way to go. They are certainly attractive for some reasons. Some search engines have different rules about what they allow, so tailoring content may be helpful. Some sites depend on CGI or other mechanisms that keep their content invisible, so having these bridge pages may be the only way to be represented.
Why not do it, or why keep it to a minimum? There are some good reasons:
First, it takes a lot of time. Imagine you create a topic-specific page for each search engine. That's six pages. Now imagine you want to cover five different topics. Now you are balancing 30 pages. Now imagine you are doing this for very popular topics. You may have to change your pages on a daily basis, to keep up with others aggressively attacking these terms.
Some people decide it's worth this incredible time investment, but the vast majority would benefit more from developing their sites with real content, building links and performing other types of Internet publicity.
Second, bridge pages tend to be focused around specific phrases. However, people search for information in many, many ways. You can't anticipate all the ways, but sites with good content can tap into a lot of it. This is because their pages are naturally loaded with relevant terms. With the right tuning of a page title and reinforcement with meta tags, these pages can capture those other terms.
That's a key lesson to keep in mind. You don't necessarily have to create entry pages. Every page in a web site is a possible entry point. Optimize those pages, make sure they are properly submitted, and you may be very pleased with the results.
A further reason is that you could inadvertently misstep. To avoid this entirely, follow up with the search engines before starting, if you feel you have a site that needs to have bridge pages because of particular problems.
Some final notes. All of the search engines are looking at ways to apply relevancy not just to a page, but to a site. That means if someone searches for "Hawaiian Vacations," they'd like to determine which sites have substantial (and legitimate) content related to this topic.
Lycos search manager Rajive Mathur said that this type of move is "inevitable," for Lycos and for the other search engines. As they transition to this, those finding bridge pages to be effective (and not everyone does) may be looking at lost traffic.
"I'm sure they're looking at [bridge pages] and saying, 'But I'm having such good success,'" said Excite's Kris Carpenter. "In three to six months, it may have the opposite effect."
Article within Search Engine Watch that provides more details about different types of bridge pages, and how they are used.
Response to this article from the company that manages State Farm's listings.