To Or is Human

Perhaps no other "advanced" search technique causes more trouble than the incorrect use of the Boolean OR operator. Here's why this simple little word can wreak havoc on your search results.

Here's a trick question: What does the word "or" mean? Let's consider two scenarios.

Say you're ordering breakfast, and ask for the #4 special. Your waiter asks, "Would you like your eggs fried or scrambled? Toast or biscuit? Coffee or juice?"

In each case, you'd specify one OR the other. You might choose scrambled eggs, toast, and coffee. Or perhaps fried eggs, biscuit, and juice. But for each type of food, you'd get just one of the two possible choices. Simple and straightforward, right?

Heh. Now try those same combinations with a search engine. The Boolean OR operator tells the search engine to return results that have either or BOTH of the words in the query. With a search engine as chef, your eggs could be BOTH fried and scrambled, you'd have some mutant pastry that was BOTH toast and biscuit, and your beverage would be a blend of coffee AND juice. What a mess!

The simple, seemingly straightforward little word "or" has nearly opposite meanings to people and to search engines. Why? Whereas people think of "or" as being exclusive, excluding all but one option, search engines do the opposite, using the Boolean inclusive OR to return results that include any or all of the terms in a query.
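To make the difference concrete, here is a minimal Python sketch that prints the truth table for both meanings of "or." The function names are made up for illustration; they are not taken from any search engine.

```python
# Illustrative sketch: the two meanings of "or".
# Function names are hypothetical, chosen only for this example.

def inclusive_or(a: bool, b: bool) -> bool:
    """Boolean OR as a search engine applies it: either term, or both."""
    return a or b

def exclusive_or(a: bool, b: bool) -> bool:
    """The everyday 'or' a waiter means: one choice or the other, not both."""
    return a != b

def truth_table():
    """Return rows of (a, b, inclusive result, exclusive result)."""
    rows = []
    for a in (False, True):
        for b in (False, True):
            rows.append((a, b, inclusive_or(a, b), exclusive_or(a, b)))
    return rows

if __name__ == "__main__":
    print("fried  scrambled  search-engine OR  waiter's or")
    for fried, scrambled, engine, waiter in truth_table():
        print(f"{fried!s:<6} {scrambled!s:<10} {engine!s:<17} {waiter!s}")
```

The two columns disagree only in the last row, where both inputs are true: the search engine is happy to serve eggs that are both fried and scrambled, and the waiter is not.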

Making matters worse for the unwary searcher, some search engines automatically insert an OR connector between your search terms, resulting in exactly the kind of mess we described above. No wonder some of your search results seem so strange!

Most of the major search engines at one time or another automatically performed an OR operation on your query. In fact, AltaVista only recently switched to ANDing terms by default -- though it still performs a default OR for some types of queries, typically when there are few web documents containing all of your search terms.

As a rule of thumb, think of "any terms" as the equivalent of OR, and "all terms" as the equivalent of the Boolean AND. "Any terms" will return at least as many results as "all terms," and usually far more, simply because it's a much looser requirement.
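Here is a small sketch of that rule of thumb in Python. The four "documents" and the function names are invented for the example, and the matching is a naive whole-word comparison, not how any real engine indexes pages.

```python
# Illustrative sketch: "any terms" (OR) versus "all terms" (AND).
# DOCS is a made-up collection; real engines use far more elaborate indexes.

DOCS = {
    1: "fried eggs with toast",
    2: "scrambled eggs and coffee",
    3: "fried and scrambled eggs",
    4: "orange juice",
}

def words(text):
    """Split a text into a set of lowercase words."""
    return set(text.lower().split())

def match_any(terms, docs):
    """OR: keep documents containing at least one of the terms."""
    wanted = {t.lower() for t in terms}
    return sorted(doc_id for doc_id, text in docs.items() if wanted & words(text))

def match_all(terms, docs):
    """AND: keep only documents containing every one of the terms."""
    wanted = {t.lower() for t in terms}
    return sorted(doc_id for doc_id, text in docs.items() if wanted <= words(text))

if __name__ == "__main__":
    query = ["fried", "scrambled"]
    print("any terms:", match_any(query, DOCS))  # documents 1, 2 and 3
    print("all terms:", match_all(query, DOCS))  # document 3 only
```

Every document that satisfies "all terms" also satisfies "any terms," so the OR list can never be the shorter of the two.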

Even if a search engine defaults to finding any of your query words, it may do a couple of other things that take precedence over the implied OR operation. First, the engine may check to see if your query words form a phrase that can be found in any documents. Phrases are special forms of AND, where all words appear next to each other in the order you enter them.
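A rough sketch of why a phrase is a stricter form of AND: the words must all be present, and they must also sit side by side in the order typed. The function names below are hypothetical, and the word splitting is deliberately simple.

```python
# Illustrative sketch: phrase matching as a special case of AND.
# Function names are hypothetical; splitting on whitespace is a simplification.

def contains_all(query, text):
    """Plain AND: every query word appears somewhere in the text."""
    return set(query.lower().split()) <= set(text.lower().split())

def contains_phrase(query, text):
    """Phrase: the query words appear next to each other, in the same order."""
    needle = query.lower().split()
    haystack = text.lower().split()
    if not needle:
        return False
    n = len(needle)
    # Slide a window of n words across the text and compare it to the query.
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))

if __name__ == "__main__":
    for text in ("a search engine guide", "engine search tips"):
        print(text, "| AND:", contains_all("search engine", text),
              "| phrase:", contains_phrase("search engine", text))
```

The second text passes the AND test but fails the phrase test, because the words are there but in the wrong order.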

Next, some engines will also look for occurrences of the actual word "and" or the ampersand (&) symbol between your keywords. This isn't really a Boolean operation -- it's simple pattern-matching that takes advantage of the fact that when page authors use the word "and" they're grammatically connecting the keywords you're interested in, so there's a probable match for your query.
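As a sketch of that kind of pattern-matching, the Python snippet below uses a regular expression to check whether two keywords are joined by "and" or an ampersand. The function name is made up, and the sketch assumes plain alphanumeric keywords that appear in the order given; how any particular engine actually does this is not documented here.

```python
# Illustrative sketch: spotting "keyword and keyword" or "keyword & keyword".
# joined_by_and is a hypothetical helper; it assumes plain alphanumeric keywords
# and only checks the order first-then-second.
import re

def joined_by_and(first, second, text):
    """True if `first` is followed by 'and' or '&' and then `second`."""
    pattern = (
        rf"\b{re.escape(first)}"
        r"(?:\s+and\s+|\s*&\s*)"   # the word "and", or an ampersand
        rf"{re.escape(second)}\b"
    )
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

if __name__ == "__main__":
    for text in ("Coffee and juice served daily", "coffee & juice", "coffee or juice"):
        print(f"{text!r}: {joined_by_and('coffee', 'juice', text)}")
```

Note that nothing Boolean is happening here: the code is only looking for a literal string pattern on the page.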

Finally, the defaults can be overridden in most search engines if you use the implied Boolean operators, the plus sign (+) and the minus sign (-). Putting either of these operators in front of a keyword with no spaces forces the engine to return only documents with the keyword (+) or without the keyword (-).
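The sketch below shows one way the plus and minus signs could be interpreted, assuming an engine that otherwise defaults to OR. The function names and the exact fallback behavior are assumptions made for illustration; individual search engines differ in the details.

```python
# Illustrative sketch: implied Boolean operators (+ and -) layered over a default OR.
# parse_query and matches are hypothetical helpers, not any engine's real logic.

def parse_query(query):
    """Sort query words into required (+), excluded (-) and optional sets."""
    required, excluded, optional = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.add(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            optional.add(token)
    return required, excluded, optional

def matches(query, text):
    """Apply +/- first; fall back to the implied OR only if nothing is required."""
    required, excluded, optional = parse_query(query)
    found = set(text.lower().split())
    if not required <= found:      # a +word is missing
        return False
    if excluded & found:           # a -word is present
        return False
    if required or not optional:   # the +/- rules alone decide the match
        return True
    return bool(optional & found)  # default OR over the remaining words

if __name__ == "__main__":
    print(matches("+eggs -scrambled toast", "fried eggs and toast"))
    print(matches("+eggs -scrambled", "scrambled eggs"))
    print(matches("coffee juice", "orange juice"))
```

In this sketch the first and third queries match and the second does not, because the page contains a word the searcher explicitly excluded.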

So the next time you're tempted to blame "poor" or goofy results on a search engine, find out first how the search engine is processing your query. And remember, "to or is human," at least in the exclusive sense.

Boolean Searching
A quick overview of how Boolean searching works, with example queries, links to tutorials, and other useful information.

Search Engines by Search Features Chart
Greg Notess regularly updates this chart showing search engine features, including the "default" processing used by the major search engines.

The Hidden Power of AND
The deceptively simple Boolean AND operator is actually a remarkable power tool for searchers, especially when you're mining for information located on the Invisible web.

About the author

Chris Sherman is a frequent contributor to several information industry journals. He's written several books, including The McGraw-Hill CD ROM Handbook and The Invisible Web: Uncovering Information Sources Search Engines Can't See, co-authored with Gary Price. Chris has written about search and search engines since 1994, when he developed online searching tutorials for several clients. From 1998 to 2001, he was's Web Search Guide.