How Major Search Engines Regionalize

None of the major search engines has a “United States” edition, and none attempts to list only pages from the United States. Nonetheless, the major search engines tend to be dominated by US-oriented content. Furthermore, their search interfaces are in English and often highlight news or other events of interest to a US audience.

In contrast, regional editions aim to serve those in particular countries or regions of the world. They are created in a variety of ways, from minor cosmetic changes to full-blown content development. The common methods are discussed below.

Regional Interface

Creating a regional interface can be as simple as presenting the same search engine look-and-feel in the appropriate language for a particular country. For example, all the instruction text might be changed to French to create a version for France.

Changing the interface does not affect the results. Unless further modifications are made, a search on a regional edition will return the same results as a search on the main service.
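
A minimal Python sketch of this point follows. The interface strings, the tiny index and the function names are all hypothetical, chosen only for illustration; the idea is simply that the edition selects the wording shown to the user while the same search runs underneath.

```python
# Hypothetical interface strings for two editions (illustration only).
UI_TEXT = {
    "en": {"prompt": "Search the web", "button": "Search"},
    "fr": {"prompt": "Rechercher sur le Web", "button": "Rechercher"},
}

# Toy stand-in for the shared "world" index: query -> matching pages.
INDEX = {
    "paris hotels": [
        "http://www.example.com/paris",
        "http://www.example.fr/hotels",
    ],
}

def search(query):
    """Look up the query in the shared index; the edition plays no part here."""
    return INDEX.get(query.lower(), [])

def render(query, edition):
    """Combine edition-specific interface text with edition-independent results."""
    text = UI_TEXT[edition]
    return {
        "prompt": text["prompt"],
        "button": text["button"],
        "results": search(query),
    }

english_page = render("Paris hotels", "en")
french_page = render("Paris hotels", "fr")
```

Here the two pages differ in their button and prompt text but carry an identical result list.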

Domain Filtering

In domain filtering, a search engine’s “world” index is filtered so that only sites from a particular country’s domain will appear. For example, the United Kingdom’s domain is .uk, so UK-specific web sites should end in that domain.

By filtering out all pages except those from .uk domains, a search engine can create a passable UK-specific edition.

In some cases, the filtering is expanded. For example, a search with Excite Germany also returns matches from sites in Austrian and Swiss domains.
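
The filtering described above might be sketched in Python roughly as follows. The URLs and function names are made up for illustration, and the sketch assumes the country is judged purely from the last label of the hostname, which is a simplification of whatever a real search engine does.

```python
from urllib.parse import urlsplit

def country_suffix(url):
    """Return the last label of the URL's hostname, e.g. 'uk' for www.example.co.uk."""
    host = urlsplit(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower()

def filter_by_country(urls, suffixes):
    """Keep only URLs whose hostname ends in one of the given country suffixes."""
    wanted = {s.lower().lstrip(".") for s in suffixes}
    return [u for u in urls if country_suffix(u) in wanted]

# Toy stand-in for the "world" index (hypothetical URLs).
world_index = [
    "http://www.example.co.uk/news",
    "http://www.example.de/",
    "http://www.example.at/",
    "http://www.example.com/",
]

# A UK edition keeps only .uk sites.
uk_edition = filter_by_country(world_index, [".uk"])

# An expanded German-language edition also accepts Austrian and Swiss domains.
german_edition = filter_by_country(world_index, [".de", ".at", ".ch"])
```

Passing a list of suffixes rather than a single one lets the same function serve both the strict and the expanded case.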

The problem with domain filtering is that many companies outside the US have registered .com domains. These companies must be manually included in the results; otherwise, they may be accidentally excluded from the regional search engine’s listings.
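
One way to picture the manual inclusion step is a hand-maintained list of hosts that is checked alongside the domain test. The Python sketch below assumes such a list exists; the host names and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical, hand-maintained list of .com hosts known to belong to UK companies.
MANUAL_UK_HOSTS = {"britishfirm.example.com"}

def in_uk_edition(url, manual_hosts=MANUAL_UK_HOSTS):
    """Accept a page if its host is under .uk or has been added to the manual list."""
    host = urlsplit(url).hostname or ""
    return host.endswith(".uk") or host in manual_hosts
```

For instance, in_uk_edition("http://britishfirm.example.com/about") is accepted because of the manual entry, while an unlisted .com host is still filtered out. The cost is that someone has to keep the list up to date.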

A bigger problem will occur if new top-level domains such as .web come into existence. These will be completely country-independent, so domain filtering will become difficult to maintain. Additionally, some country-specific domains have already been devalued by those using them for other purposes. See the Goodbye Domain Names, Hello RealNames? article for more about this issue.

Domain Crawling

In domain crawling, a search engine maintains both a “world” index and a “country” index, with the latter providing greater regional coverage. Some of the same pages may be listed in both, but the country index may have greater depth.
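
As a rough illustration of the two-index idea, the Python sketch below crawls a toy link graph twice: once shallowly for the world index, and once more deeply but restricted to .uk hosts for the country index. The link graph, the depth limits and the function names are assumptions made for the example, not a description of how any particular search engine crawls.

```python
from collections import deque
from urllib.parse import urlsplit

# Toy link graph standing in for the web: page -> pages it links to (hypothetical).
LINKS = {
    "http://a.example.uk/": ["http://a.example.uk/1", "http://b.example.com/"],
    "http://a.example.uk/1": ["http://a.example.uk/2"],
    "http://a.example.uk/2": [],
    "http://b.example.com/": ["http://b.example.com/1"],
    "http://b.example.com/1": [],
}

def crawl(seeds, max_depth, allow=lambda url: True):
    """Breadth-first crawl over LINKS, following links up to max_depth hops from the seeds."""
    seen = set()
    queue = deque((s, 0) for s in seeds if allow(s))
    while queue:
        url, depth = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        if depth < max_depth:
            for nxt in LINKS.get(url, []):
                if allow(nxt) and nxt not in seen:
                    queue.append((nxt, depth + 1))
    return seen

def is_uk(url):
    """True if the URL's hostname is under the .uk domain."""
    return (urlsplit(url).hostname or "").endswith(".uk")

seeds = ["http://a.example.uk/"]
world_index = crawl(seeds, max_depth=1)            # shallow, all domains
uk_index = crawl(seeds, max_depth=3, allow=is_uk)  # deeper, .uk only
```

In this toy run the two indexes share the UK home page and its first sub-page, while the deeper UK page appears only in the country index.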

Human Categorization

Search engines like Yahoo, LookSmart and the Open Directory, which rely on human editors, create regional listings through classification. Sites relevant to a particular country are listed within a country-specific edition of the directory.

Language Specific

Many of the major crawler-based search engines now have their spiders check pages for common words and markers of specific languages. If these are found, a page is “tagged” internally as being in that language. When users then search for pages in that language, they get only these pages.
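
The common-word check could look something like the Python sketch below. The word lists are tiny and purely illustrative, the min_hits threshold is an arbitrary assumption, and the function name is hypothetical; real spiders presumably use much larger vocabularies and additional markers.

```python
import re

# Tiny, illustrative lists of very common words per language (assumed, not exhaustive).
COMMON_WORDS = {
    "en": {"the", "and", "of", "to", "is", "in"},
    "fr": {"le", "la", "les", "et", "des", "est"},
    "de": {"der", "die", "und", "das", "ist", "nicht"},
}

def tag_language(text, min_hits=2):
    """Tag text with the language whose common words occur most often, or None if too few match."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(w in vocab for w in words)
        for lang, vocab in COMMON_WORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_hits else None

# Hypothetical crawled pages: URL -> page text.
pages = {
    "http://www.example.fr/": "Le chat est sur la table et les chiens sont dans le jardin",
    "http://www.example.de/": "Der Hund ist nicht in der Stadt und die Katze auch nicht",
    "http://www.example.com/": "The index of the web is large",
}

# Tag each page once at crawl time, then filter on the tag at search time.
tags = {url: tag_language(text) for url, text in pages.items()}
french_only = [url for url, lang in tags.items() if lang == "fr"]
```

Tagging at crawl time means a language-restricted search only has to compare stored tags rather than re-read each page.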

