Why You Should Prevent Certain Pages From Being Indexed

Author

Mathieu Burgerhout

Date published October 7, 2010

Every website has more important and less important pages. Unimportant pages are an unavoidable part of your website’s hierarchy or structure. They only become harmful when you fail to recognize them.

Here’s how you can determine which of your pages are unimportant and prevent them from being indexed. The result will be a clean, mean, money-making online conversion machine.

Common Unimportant Pages

Nine out of 10 times, unimportant pages are the source of duplicate content.

Many features aimed at visitors generate a lot of unimportant pages. Some commonly used features that are known for generating them are:

Faceted/filtered navigation
Print page
Tell a friend
Internal search

If any of these features are on your website, the chances are high that there are unimportant pages indexed. Be aware of this and check if any of those “feature” pages can be indexed.

Check with specialized search queries if any of those pages are already indexed. For example, to check if the “tell a friend” option is indexed for our website, we’d execute the following query in Google (locally for better results):

[Screenshot: Tell a Friend search]

The “inurl” operator in the query asks the index to only show results for URLs containing “tell-a-friend.” With this query, you’ll be shown all URLs with “tell-a-friend” in them. These are definitely unimportant pages, because each one contains only an input field for an e-mail address and a send button.
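
A query of this kind can be put together with a small sketch like the one below. The original screenshot of the query is not reproduced here, so combining “inurl” with the “site” operator, the function names, and the domain example.com are all assumptions made for illustration.

```python
from urllib.parse import quote_plus


def build_index_check_query(domain: str, url_fragment: str) -> str:
    """Build a query restricted to one site and to URLs containing a fragment.

    Pairing `site:` with `inurl:` is an assumption; the article only
    confirms that `inurl` was part of the query.
    """
    return f"site:{domain} inurl:{url_fragment}"


def build_search_url(query: str) -> str:
    """Turn the query into a Google search URL with the query URL-encoded."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# example.com is a placeholder domain, not the site from the article.
query = build_index_check_query("example.com", "tell-a-friend")
search_url = build_search_url(query)
```

Pasting the resulting URL into a browser runs the same check by hand for any other “feature” page, such as a print page or internal search page.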

Unimportant for your visitors? No, not at all. This is a convenient, helpful feature for your visitors.

Unimportant for search engines? Yes, sir!

[Screenshot: Tell a Friend results]

In our case, Google had discovered more than 7,000 “tell-a-friend” pages. That number makes sense, because our website has roughly 7,000 product pages.

These extra 7,000 pages are draining weight from the 7,000 money-making product pages. The “tell a friend” option is on every product page!

Every product gives the “tell a friend” option a unique ID, which generates a new URL. So, 7,000 products generate at least the same number of “tell a friend” pages.
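
The one-to-one growth described above can be illustrated with a short sketch. The URL pattern, the `id` parameter name, and the domain are hypothetical; the article only states that each product ID produces its own “tell a friend” URL.

```python
def tell_a_friend_urls(product_ids, base="https://www.example.com/tell-a-friend.php"):
    """Return one "tell a friend" URL per product ID.

    The base URL and the `id` query parameter are assumptions for
    illustration, not the real site's URL scheme.
    """
    return [f"{base}?id={product_id}" for product_id in product_ids]


# Roughly 7,000 products, as in the article's example.
urls = tell_a_friend_urls(range(1, 7001))

# Every product ID yields a distinct URL, so the number of
# crawlable "feature" pages matches the number of products.
distinct_pages = len(set(urls))
```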

How to Exclude

How do we exclude these kinds of pages? There are several ways to keep pages out of the index, each with its pros and cons. Indexation can be prevented with:

JavaScript
Meta “robots” tags
Robots.txt

The best way is to not make the pages available to search engines at all, which means not linking to them with regular HTML links. This can be done using JavaScript.

Because search engines don’t execute (a lot of) JavaScript, this is a good method to make the pages unavailable. Especially for faceted navigation, this is the most convenient method to prevent all those extra option pages from being indexed, since they are only variations of the original page anyway (i.e., duplicates).

The far easier option is to exclude pages using meta instructions. This option is easy to implement and fairly effective. Add the “robots” meta tag to the pages you don’t want indexed, and put the “noindex” instruction in that tag.

The instruction can be extended with “nofollow” or “follow.” Use “follow” when the pages are already indexed (as in the example above). The pages will then be kept out of the index, while the links on the (not indexed) pages can still be followed and pass on a small amount of link juice. Of course, new pages won’t be indexed at all and therefore will pass zero link juice.
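
As a minimal sketch of the meta approach, the snippet below embeds a “robots” meta tag with “noindex, follow” in a sample page and then reads the directives back, which is a quick way to verify a template. The sample page, the class name `RobotsMetaParser`, and the helper `robots_directives` are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical "tell a friend" page carrying the robots meta tag.
TELL_A_FRIEND_PAGE = """<html><head>
<title>Tell a friend</title>
<meta name="robots" content="noindex, follow">
</head><body><form>...</form></body></html>"""


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow".
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


directives = robots_directives(TELL_A_FRIEND_PAGE)
```

A page without the tag yields an empty set, which makes it easy to spot “feature” pages that are still missing the instruction.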

Another effective method is to set exclusion rules in robots.txt to exclude pages or even whole folders. The major search engines honor these rules. If you add a rule that disallows all “tell-a-friend.php” pages, search engines will leave those pages alone.
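
A rule like this can be tried out before it goes live. The sketch below feeds a sample robots.txt to Python’s standard `urllib.robotparser` and asks whether two URLs may be fetched. The rule text, the domain, and the product URL are assumptions; only the “tell-a-friend.php” file name comes from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block every URL whose path
# starts with /tell-a-friend.php, for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tell-a-friend.php
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and the query parameters are placeholders.
feature_page_allowed = is_allowed(
    ROBOTS_TXT, "https://www.example.com/tell-a-friend.php?id=42"
)
product_page_allowed = is_allowed(
    ROBOTS_TXT, "https://www.example.com/product.php?id=42"
)
```

With these rules the “tell a friend” URL is blocked while the product page stays crawlable. The check only shows how a rule-abiding crawler would read the file; it says nothing about what is already in the index.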

Although this is effective and strict, Google nowadays shows the disallowed pages as “crawled” in the index. Crawled pages can be recognized by a result only showing you the URL as a title. Crawled pages won’t be shown in the search results.

Another disadvantage of excluding pages via robots.txt: link juice is still passed to your unimportant pages because the links are still there.

No nofollow?

Touchy subject, but a “nofollow” isn’t effective enough (anymore) to exclude pages from the index, like our “tell a friend” pages. If anyone can prove me wrong, please do. Isn’t it weird to basically tell Google that an internal page on your site isn’t trustworthy or (possibly) spam?

Conclusion

So, is our website owner stuck with 7,000 indexed “tell a friend” pages? Well, for a while at least. The pages will drop out of the index over time.

With the right exclusion rules, there will be no new “unimportant” pages indexed and the website’s owner can put more weight on their money-making pages and keep the index clean of unimportant pages.
