Search Spotlight: AltaVista Searches

Search Spotlight is a rotating feature that focuses on the different ways people search for information. This month, it looks at findings from an analysis of over one-half billion searches at AltaVista. Researchers at the Compaq Systems Research Center found:

  • Most users only look at the first page of results (68%).
  • Most users perform only one search per visit (77.6%).
  • Most queries are short, one or two words long (65.2%).
  • Most people who modify their searches add or subtract words (63.4%), but a significant number try completely new terms (35.2%).
  • Most queries make no use of search operators (80%).
  • Most unique queries appeared only once during the study (63.7%), though the most popular terms were used by many users, and the top 25 made up 1.5% of the total searches.

Researchers collected queries over 43 days, from August 2 through September 13, 1998. Surprisingly, 20.6% of searches were empty -- they had no words at all in the search box! (Infoseek chairman Steve Kirsch also told me in 1998 that the most popular search term at his service is an empty query, so this doesn't seem to be unique to AltaVista.)

For the queries that weren't empty, the researchers found that users viewed only the first page of results 68% of the time, indicating that they either found what they were looking for right away or were discouraged and gave up. The researchers had no way of knowing which was true. Users went beyond the first page of results only 32% of the time.

Somewhat related is the fact that users tended to do only one search per visit to AltaVista, with a visit being defined as any activity by the same person (as identified by cookie) without a break of 5 minutes or longer. In other words, if someone came to AltaVista and made several queries, that would be a visit. If they went away and came back an hour later, the break would have been longer than 5 minutes, and so their return would be considered a new visit.
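The visit definition above can be sketched in a few lines of Python. This is only an illustration of the 5-minute rule as described here, not the researchers' actual code; the function name `split_into_visits`, the event layout (cookie, timestamp in seconds, query) and the sample log are all assumptions made for the example.

```python
from collections import defaultdict

VISIT_GAP_SECONDS = 5 * 60  # a break this long (or longer) ends a visit


def split_into_visits(events):
    """Group (cookie, timestamp_seconds, query) events into visits.

    A new visit starts whenever the same cookie has been idle for
    VISIT_GAP_SECONDS or longer.
    """
    by_cookie = defaultdict(list)
    for cookie, ts, query in events:
        by_cookie[cookie].append((ts, query))

    visits = []
    for cookie, items in by_cookie.items():
        items.sort(key=lambda item: item[0])  # chronological order
        current = []
        last_ts = None
        for ts, query in items:
            if last_ts is not None and ts - last_ts >= VISIT_GAP_SECONDS:
                visits.append((cookie, current))  # close the previous visit
                current = []
            current.append(query)
            last_ts = ts
        visits.append((cookie, current))  # close the final visit
    return visits


# Hypothetical sample log: user-a returns an hour later, which starts a new visit.
log = [
    ("user-a", 0, "dog"),
    ("user-a", 60, "dog food"),
    ("user-a", 3660, "cat"),
    ("user-b", 10, "mp3"),
]
visits = split_into_visits(log)
```

With this sample, user-a ends up with two visits (one with two queries, one with a single query) and user-b with one.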

Researchers found that 77.6% of users made only one query per visit, 13.5% made two queries, 4.4% made three queries and 4.5% made more than three queries. These numbers include blank queries.

Researchers also found that queries are relatively short. The most common query length was two words (26%), closely followed by one word (25.8%). However, these numbers are skewed because blank queries are not removed, and they make up 20.6% of the total. When you remove those blank queries, the breakdown is more dramatic:

1 word 32.5%
2 words 32.7%
3 words 18.9%
More than 3 words 15.9%

It turns out that the majority of non-blank searches, 65.2%, contained only one or two words.
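The adjustment for blank queries is simple rescaling: each share of all queries is divided by the share that was not blank. A minimal sketch, using only the percentages quoted above (the helper name `share_excluding_blanks` is made up for the example):

```python
def share_excluding_blanks(share_of_all, blank_share):
    """Rescale a percentage of all queries to a percentage of non-blank queries."""
    return share_of_all / (100.0 - blank_share) * 100.0


BLANK = 20.6  # percent of all queries that were blank

one_word = share_excluding_blanks(25.8, BLANK)
two_words = share_excluding_blanks(26.0, BLANK)

print(round(one_word, 1), round(two_words, 1))  # prints: 32.5 32.7
print(round(one_word + two_words, 1))           # prints: 65.2
```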

The researchers also looked at how people modified their searches during a visit. Remember, the majority of users do one query per visit and leave. That means the remainder, 22.4%, probably try to modify their queries. The statistics below cover how these users try modifying their searches.

Not surprisingly, most modifications involved adding or subtracting words: 63.4%, in all. But a significant number of queries, 35.2%, involved people trying again with a completely different search. This is a bit misleading, though. It assumes that they were indeed trying a new search to find the same information as in the preceding search. It is possible that some of these people decided to look for something completely different. It is also possible that automated tools such as position checkers and search utilities are skewing the results. Finally, a tiny 1.4% of modifications involved only altering search operators.
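To make the three categories concrete, here is a rough sketch of how two consecutive queries in a visit might be compared. The exact rules the researchers used are not described here, so everything in this example is an assumption: the function names, the category labels, and the simplification that any shared word counts as "words added or removed".

```python
OPERATORS = "+-"  # only the + and - operators are handled in this sketch


def words(query):
    """Return the set of words in a query, with leading +/- operators stripped."""
    return {token.lstrip(OPERATORS) for token in query.split()} - {""}


def classify_modification(previous, current):
    """Label how `current` differs from `previous` (assumed categories)."""
    prev_words, cur_words = words(previous), words(current)
    if prev_words == cur_words:
        # Same words: either only the operators changed, or nothing did
        # (a change in word order alone is treated as no change).
        if set(previous.split()) != set(current.split()):
            return "operators only"
        return "unchanged"
    if prev_words & cur_words:
        return "words added or removed"
    return "completely new"


print(classify_modification("dog", "dog food"))      # words added or removed
print(classify_modification("dog food", "titanic"))  # completely new
print(classify_modification("dog cat", "+dog cat"))  # operators only
```

As the text notes, a "completely new" query may simply be a different information need, which a comparison like this cannot detect.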

Search operators, such as the + and - symbols, are also rarely used, the research found. About 80% of queries used no operators at all, while about 10% used only one operator and 6% used two operators.

Out of the 575 million total searches, there were 154 million unique search terms. That works out to roughly one unique term for every four searches, which suggests a large share of searches were for something few or no other users searched for during the six-week study period -- or at least not in exactly the same way.

The number of unique search terms would be lower, possibly significantly so, if capitalization hadn't been taken into account. Researchers considered a search for the same term to be different if one search used capitalization while another didn't. Thus, a search for "dog" would have been seen as unique and different than a search for "Dog." Certainly AltaVista treats them as different searches, because it is case-sensitive. However, most AltaVista users are probably unaware that this is happening and so treating capitalized terms as distinct from non-capitalized forms is probably not best when considering user behavior.

Conversely, the number of unique search terms might have been higher if word order and operators had been taken into account. For instance, the researchers would have considered all of these to be exactly the same search:

dog cat
cat dog
+dog cat
-dog cat
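One way to picture this kind of matching is a canonical key that drops operators and ignores word order but keeps capitalization. This is a sketch of the behavior described above, not the researchers' actual procedure; the name `canonical_key` is invented for the example, and only the + and - operators are handled.

```python
def canonical_key(query):
    """Reduce a query to a key that ignores operators and word order.

    Capitalization is deliberately preserved, so "dog" and "Dog" stay distinct.
    """
    terms = [token.lstrip("+-") for token in query.split()]
    return tuple(sorted(term for term in terms if term))


variants = ["dog cat", "cat dog", "+dog cat", "-dog cat"]
keys = {canonical_key(q) for q in variants}

print(keys)                                         # {('cat', 'dog')}
print(canonical_key("dog") == canonical_key("Dog"))  # False
```

Lowercasing each term before sorting would merge the capitalized and non-capitalized forms as well, which is the alternative discussed above.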

Despite these concerns, it seems pretty clear that users search for things in a wide variety of different ways. In fact, of the unique search terms, 63.7% appeared only once during the study period; 16.2% appeared twice, and 6.5% appeared three times. After that, the numbers begin rising again: 13.6% of the terms appear more than three times. This last set of terms is almost certainly made up of popular, common queries asked by many people.
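A breakdown like this can be computed by first counting how often each unique term occurs, then counting how many terms fall into each bucket. The sketch below uses a tiny made-up list of queries, so its percentages have nothing to do with the study's; the function name and bucket labels are assumptions for illustration.

```python
from collections import Counter

LABELS = {1: "once", 2: "twice", 3: "three times", 4: "more than three times"}


def repeat_distribution(queries):
    """Percentage of unique terms that appear once, twice, three times, or more."""
    per_term = Counter(queries)       # occurrences of each unique term
    if not per_term:
        return {label: 0.0 for label in LABELS.values()}
    buckets = Counter(min(count, 4) for count in per_term.values())
    total_terms = len(per_term)
    return {
        label: 100.0 * buckets[key] / total_terms
        for key, label in LABELS.items()
    }


# Hypothetical sample: 9 searches, 5 unique terms ("dog" and "Dog" are distinct).
sample = ["sex", "sex", "sex", "sex", "mp3", "mp3", "dog", "Dog", "cat food"]
distribution = repeat_distribution(sample)
print(distribution)
```

For this sample, three of the five unique terms appear once (60%), one appears twice (20%), none appears exactly three times, and one appears more than three times (20%).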

Examples of these are the 25 most popular searches, shown below. These terms made up 1.5% of the total searches surveyed. Interestingly, the second most popular search term -- "applet" -- was almost entirely requested by an automated agent.

Most Popular Searches at AltaVista
Aug. 2 - Sept. 13, 1998

Term Searches
sex 1,551,477
applet 1,169,031
porno 712,790
mp3 613,902
chat 406,014
warez 398,953
yahoo 377,025
playboy 356,556
xxx 324,923
hotmail 321,267
[non-ASCII query] 263,760
pamela anderson 256,559
p**** 234,037
sexo 226,705
porn 212,161
nude 190,641
lolita 179,629
games 166,781
spice girls 162,272
beastiality 152,143
animal sex 150,786
SEX 150,699
gay 142,761
titanic 140,963
bestiality 136,578

Analysis of a Very Large AltaVista Query Log
Compaq Systems Research Center, October 26, 1998

You can download a copy of the actual research report here, though it's only available in Acrobat/PDF and PostScript formats.

What People Search For

Still curious about how people search? Look at this page in Search Engine Watch for more resources.