Perhaps no other "advanced" search technique causes more trouble than the incorrect use of the Boolean OR operator. Here's why this simple little word can wreak havoc on your search results.
Here's a trick question: What does the word "or" mean? Let's consider two scenarios.
Say you're ordering breakfast, and ask for the #4 special. Your waiter asks, "Would you like your eggs fried or scrambled? Toast or biscuit? Coffee or juice?"
In each case, you'd specify one OR the other. You might choose scrambled eggs, toast, and coffee. Or perhaps fried eggs, biscuit, and juice. But for each type of food, you'd get just one of the two possible choices. Simple and straightforward, right?
Heh. Now try those same combinations with a search engine. The Boolean OR operator tells the search engine to return results that have either or BOTH of the words in the query. With a search engine as chef, your eggs could be BOTH fried and scrambled, you'd have some mutant pastry that was BOTH toast and biscuit, and your beverage would be a blend of coffee AND juice. What a mess!
The simple, seemingly straightforward little word "or" has nearly opposite meanings to people and to search engines. Why? Whereas people think of "or" as exclusive, ruling out all but one option, search engines do the opposite: they use the Boolean inclusive OR to return results that include one or all of the terms in a query.
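To make the difference concrete, here is a minimal Python sketch. It is a toy illustration, not any real search engine's code: the `docs` collection and both function names are made up for this example. It compares the inclusive OR an engine uses with the exclusive "or" a diner has in mind.

```python
# Toy "index": each hypothetical document is just the set of words it contains.
docs = {
    "fried only": {"fried"},
    "scrambled only": {"scrambled"},
    "both": {"fried", "scrambled"},
    "neither": {"poached"},
}

def matches_inclusive(words, a, b):
    """Boolean (inclusive) OR: true if either word or both are present."""
    return a in words or b in words

def matches_exclusive(words, a, b):
    """Everyday (exclusive) 'or': true if exactly one of the words is present."""
    return (a in words) != (b in words)

inclusive_hits = [name for name, words in docs.items()
                  if matches_inclusive(words, "fried", "scrambled")]
exclusive_hits = [name for name, words in docs.items()
                  if matches_exclusive(words, "fried", "scrambled")]

print(inclusive_hits)  # includes the document containing both words
print(exclusive_hits)  # leaves that document out
```

The only document the two versions disagree about is the one containing both words, which is exactly the "fried AND scrambled" plate the waiter would never bring you.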
Making matters worse for the unwary searcher, some search engines automatically insert an OR connector between your search terms, resulting in exactly the kind of mess we described above. No wonder some of your search results seem so strange!
Most of the major search engines at one time or another automatically performed an OR operation on your query. In fact, AltaVista only recently switched to ANDing terms by default -- though it still performs a default OR for some types of queries, typically when there are few web documents containing all of your search terms.
As a rule of thumb, think of "any terms" as the equivalent of OR, and "all terms" as the equivalent of the Boolean AND. "Any terms" will never return fewer results than "all terms," and usually returns many more, simply because it's a much looser requirement: every page that contains all of your terms also contains at least one of them.
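The rule of thumb can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the `pages` collection, the `search` function, and its `mode` parameter are invented here, and real engines do far more than split text on spaces.

```python
# Hypothetical mini-collection of pages, keyed by a made-up page name.
pages = {
    "p1": "fried eggs and toast",
    "p2": "scrambled eggs with coffee",
    "p3": "fried and scrambled eggs",
    "p4": "orange juice",
}

def tokenize(text):
    """Lowercase the text and split on whitespace into a set of words."""
    return set(text.lower().split())

def search(query, mode):
    """mode='any' behaves like OR; mode='all' behaves like AND."""
    terms = tokenize(query)
    combine = any if mode == "any" else all
    return [name for name, text in pages.items()
            if combine(term in tokenize(text) for term in terms)]

any_hits = search("fried scrambled", "any")
all_hits = search("fried scrambled", "all")

print(any_hits)
print(all_hits)
```

Whatever the query, every page in the "all" list also appears in the "any" list, so the OR-style search can only be the same size or larger.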
Even if a search engine defaults to finding any of your query words, it may do a couple of other things that take precedence over the implied OR operation. First, the engine may check to see if your query words form a phrase that can be found in any documents. Phrases are special forms of AND, where all words appear next to each other in the order you enter them.
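A phrase check of this kind might look roughly like the following Python sketch. The function names are hypothetical, and the word-splitting is deliberately naive (no punctuation handling); the point is only that a phrase match is an AND match with the extra requirement that the words sit side by side in the order entered.

```python
def contains_all(text, query):
    """Plain AND: every query word appears somewhere in the text."""
    words = set(text.lower().split())
    return all(term in words for term in query.lower().split())

def contains_phrase(text, phrase):
    """Phrase match: the query words appear adjacent and in the same order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    if n == 0:
        return False
    # Slide a window of n words across the text and compare each to the phrase.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

text_a = "fresh scrambled eggs daily"
text_b = "eggs scrambled fresh daily"

print(contains_all(text_a, "scrambled eggs"), contains_phrase(text_a, "scrambled eggs"))
print(contains_all(text_b, "scrambled eggs"), contains_phrase(text_b, "scrambled eggs"))
```

Both sample texts pass the AND test, but only the first passes the stricter phrase test.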
Next, some engines will also look for occurrences of the actual word "and" or the ampersand (&) symbol between your keywords. This isn't really a Boolean operation -- it's simple pattern-matching that takes advantage of the fact that when page authors use the word "and" they're grammatically connecting the keywords you're interested in, so there's a probable match for your query.
Finally, the defaults can be overridden in most search engines if you use the implied Boolean operators, the plus sign (+) and the minus sign (-). Putting either of these operators in front of a keyword with no spaces forces the engine to return only documents with the keyword (+) or without the keyword (-).
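Here is one way the plus and minus operators could be modeled in Python. Everything in it is an assumption for the sake of illustration, including the function names and the choice to treat unmarked words as a default OR that applies only when no required words are given; individual engines differ in the details.

```python
def parse_query(query):
    """Split a query into required (+word), excluded (-word), and plain terms."""
    required, excluded, plain = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.add(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            plain.add(token)
    return required, excluded, plain

def matches(text, query):
    """Apply +/- first; fall back to a default OR over the plain terms."""
    words = set(text.lower().split())
    required, excluded, plain = parse_query(query)
    if not required <= words:   # a +word is missing
        return False
    if excluded & words:        # a -word is present
        return False
    if plain and not required:  # assumed default: any plain term will do
        return bool(plain & words)
    return True

print(matches("coffee and toast", "+coffee -juice"))
print(matches("coffee and juice", "+coffee -juice"))
```

The first page qualifies because it has the required word and lacks the excluded one; the second is rejected as soon as the excluded word turns up.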
So the next time you're tempted to blame "poor" or goofy results on a search engine, find out first how the search engine is processing your query. And remember, "to or is human," at least in the exclusive sense.
A quick overview of how Boolean searching works, with example queries, links to tutorials, and other useful information.
Search Engines by Search Features Chart
Greg Notess regularly updates this chart showing search engine features, including the "default" processing used by the major search engines.
The Hidden Power of AND
The deceptively simple Boolean AND operator is actually a remarkable power tool for searchers, especially when you're mining for information located on the Invisible web.
NOTE: Article links often change. In case of a bad link, use the publication's search facility, which most have, and search for the headline.