AltaVista introduced a number of significant new search features in October, including the display of relevant results from the Ask Jeeves answer service, a new filtering option and a photo search service.
Those performing a standard web search at the service may now receive information from up to four different sources on the search results page, a combination that AltaVista has dubbed "Full View Searching." The hope is that more sources mean better odds that a searcher will come away with relevant results.
Most notable are the new "Ask AltaVista" results, a branded version of the Ask Jeeves service. These results appear in response to many queries, under the heading of "AltaVista knows the answers to these questions."
These Ask Jeeves results give AltaVista a real edge in terms of relevancy over its competitors, because the results are often exactly on target in response to a query.
A separate article below describes the Ask Jeeves system in more detail. In short, editors have created 7 million questions, which are linked to web pages that provide answers.
For instance, when you search for "software," Ask AltaVista responds with questions such as "Where can I find product reviews for software?" Clicking on the answer button takes you to a site with reviews.
Doing the same search at Ask Jeeves will reveal many more answers. AltaVista is saving screen real estate by listing only the first answer, though in some cases, it will display a "More Answers" link. A better way of getting more suggestions is to phrase your search as a question.
For example, "Where can I buy software?" causes AltaVista to display several questions, such as "Where can I find good deals on new software" and "How can I buy wholesale computers, peripherals & software?"
The second source of search results is from RealNames, through a partnership AltaVista established earlier this year. RealNames is an alternative web site address system, especially meant to deliver people directly to the correct site when doing brand-oriented searches such as "nike" or "barnes & noble." A past Search Engine Report article listed below describes the system in more detail.
The RealNames link used to be at the top of the page. Now it comes between Ask AltaVista results and matching web pages. Look for the description that says, "Official company or product home page by RealName."
The previous prominence of the RealNames link had been good news for webmasters who have registered semi-generic names. Now the demotion of the link may mean a drop in related traffic.
To explain, understand that RealNames won't register a generic name, such as "shoes," to any site. So if someone searches for "shoes" at AltaVista, then clicks on the RealNames link, they will be taken to the RealNames search engine and shown a list of RealNames that contain the word "shoes."
Thus, someone who registers a name like "Tom's Shoes and Boots" has been getting indirect traffic. That traffic can be quite significant and is usually well worth the $100 yearly fee per name.
With the RealNames link no longer occupying top position in many of AltaVista's search results, this traffic is likely to drop. However, a RealNames partnership with LookSmart should begin soon, and RealNames is promising that another significant partnership is in the works. So, it is likely that registering names will still be worthwhile for many sites.
Matches from the AltaVista web index make up the third source of information being offered to searchers. Here, the service has introduced a number of changes meant to improve the relevancy of its core offering.
First, AltaVista says there has been a relevancy change, especially designed to improve results for simple and popular terms.
"We have totally revamped our ranking algorithm," said Louis Monier, AltaVista's chief technical officer. "Not only for technical questions, but also the simple questions such as cars, real estate or wine."
Monier was closemouthed about what specifically has been done to improve results, but the changes do not seem to include non-textual criteria, such as link popularity or boosts for reviewed sites, which have been remarkably effective at Infoseek.
In fact, I ran a quick comparison of the examples Monier cited against Infoseek's results. I felt Infoseek's results for these generic terms were impressive and far better than AltaVista's.
At AltaVista, there does seem to be a stronger emphasis on home pages in some cases -- these appear to carry slightly more weight in the system in response to single word and popular searches. Other positive factors seem to include short, focused titles, use of phrases in the meta keywords tag and minimal repetition of words in meta tags. There are exceptions to all of these, of course -- your mileage may vary.
Other behind-the-scenes changes are also at work, with one of the most important being automatic phrase searching.
Many people would benefit if they did phrase searching. For example, a search for "new york" -- with quote marks around the words to indicate an exact phrase should be matched -- is likely to provide better answers than a search that simply looks to see if any or all of the words appear on a page.
Since most people don't bother with using special commands like the quote marks, AltaVista has made it easier for many by making phrase searching automatic.
"We have built a huge dictionary of several million phrases," said Monier. "Essentially, the need for quotes has gone down to zero. If you look for New York, you don't need to quote New York."
Monier said the phrase detection has been built through a completely automated process, which takes advantage of linguistic patterns to determine what is a phrase, and what isn't.
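To make the idea concrete, here is a minimal sketch of how a phrase dictionary might be applied to an incoming query. It is not AltaVista's method: the tiny hand-made PHRASES set and the detect_phrases function are assumptions for illustration, and the sketch covers only the lookup side, not the automated linguistic process Monier described for building the dictionary.

```python
# Hypothetical phrase dictionary. AltaVista's real one is said to hold
# several million phrases built by an automated process.
PHRASES = {("new", "york"), ("real", "estate"), ("mad", "cow", "disease")}
MAX_PHRASE_LEN = max(len(p) for p in PHRASES)


def detect_phrases(query):
    """Split a query into units, grouping known phrases (longest match first)."""
    words = query.lower().split()
    units = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so a three-word phrase wins
        # over any shorter phrase that starts at the same word.
        for size in range(min(MAX_PHRASE_LEN, len(words) - i), 1, -1):
            candidate = tuple(words[i:i + size])
            if candidate in PHRASES:
                # Quote the phrase, as a searcher would have done by hand.
                units.append('"' + " ".join(candidate) + '"')
                i += size
                break
        else:
            # No known phrase starts here; keep the single word.
            units.append(words[i])
            i += 1
    return units


print(detect_phrases("New York hotels"))
```

In this sketch, a query containing a known phrase is treated as if the searcher had typed the quote marks, while unrecognized words pass through unchanged.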
Determining if phrase searching is active for a query is easy. Simply repeat the search using quotes. If phrase searching was on, the results should be the same. The match count should also be very close, though it may be slightly different given AltaVista's longstanding problem in delivering accurate counts. If phrase searching was not on, you should see slightly different results and very different counts.
You cannot override the function in basic search mode, but it is not active when performing an advanced search.
AltaVista has also introduced spell checking, a first for a major search service, and something that is long overdue. Yahoo has been experimenting with it off and on for several months, but at AltaVista, it's now a full-time option.
"We estimate one in five queries is misspelled," Monier said, such as "virtual" spelled as "vurtual." In other cases, people fail to separate words, such as "networkcomputer" instead of "network computer."
AltaVista won't change a misspelled query. Instead, it offers a prompt on the search results pages with other suggestions. When triggered, they appear just below the search box with the words "Spell check." Spell checking works for queries in English, French, Spanish and Italian.
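The two kinds of error Monier mentions, a mistyped letter and two words run together, can be illustrated with a short sketch. AltaVista has not said how its spell checker works, so everything here is an assumption: the small VOCABULARY list, the function names and the similarity cutoff are all hypothetical, and the close-match lookup uses Python's standard difflib module. As in AltaVista's version, the original query is left alone and a suggestion is only offered.

```python
import difflib

# Hypothetical vocabulary; a real service would draw on a far larger list.
VOCABULARY = ["virtual", "network", "computer", "software", "reality"]


def split_run_together(word):
    """Return 'left right' if the word is two vocabulary words run together."""
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in VOCABULARY and right in VOCABULARY:
            return left + " " + right
    return None


def suggest(query):
    """Return a suggested query, or None if no correction is found.

    The query itself is never altered; the caller decides whether to
    show the suggestion as a prompt.
    """
    suggested = []
    changed = False
    for word in query.lower().split():
        if word in VOCABULARY:
            suggested.append(word)
            continue
        split = split_run_together(word)
        if split:
            suggested.append(split)
            changed = True
            continue
        # Pick the single closest vocabulary word, if any is similar enough.
        close = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.8)
        if close:
            suggested.append(close[0])
            changed = True
        else:
            suggested.append(word)
    return " ".join(suggested) if changed else None


print(suggest("vurtual reality"))
print(suggest("networkcomputer"))
```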
One change AltaVista could really use is results clustering. Often, the top results seem to be dominated by pages from one site. Do a search for "sports scores," for example, and six out of the top 10 results are from CBS SportsLine. SportsLine operates various domains, and AltaVista is listing the same page under slightly different addresses. Clustering could help improve the chances that one site wouldn't crowd out others like this.
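As a rough illustration of what clustering could look like, the sketch below caps the number of results shown from any one site. It is only an assumption about how such a feature might work: the SITE_ALIASES table, the example.com addresses and the function names are hypothetical, and a real system would need a more robust way of recognizing that different domains belong to the same site.

```python
from urllib.parse import urlparse

# Hypothetical alias table folding alternate hosts into one canonical site.
SITE_ALIASES = {"www.example-sports.com": "scores.example.com"}


def site_of(url):
    """Return the canonical host for a URL, folding known aliases together."""
    host = urlparse(url).netloc.lower()
    return SITE_ALIASES.get(host, host)


def cluster_results(ranked_urls, per_site=1):
    """Keep at most `per_site` results from each site, preserving rank order."""
    seen = {}
    kept = []
    for url in ranked_urls:
        site = site_of(url)
        seen[site] = seen.get(site, 0) + 1
        if seen[site] <= per_site:
            kept.append(url)
    return kept


results = [
    "http://scores.example.com/a",
    "http://www.example-sports.com/a",
    "http://other.example.org/x",
    "http://scores.example.com/b",
]
print(cluster_results(results))
```

With a cap of one result per site, the duplicate listing under the alternate address and the second page from the same host both drop out, leaving room for other sites.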
Do a search for "mad cow disease," and you'll see a similar situation where two sites occupy five of the top ten listings. Additionally, spots two and three are taken up by pages that aren't in English. They do contain the search terms, but they are predominantly written in Japanese.
I've seen a lot of cases like this at AltaVista, where non-English pages dominate the top ten for English-language searches. One solution is to select "English" from the drop down box that sits just above the search button. That will return only pages identified as being written in English. If you don't specify English, then any page containing your search words may be listed, even if it is predominantly in another language. Naturally, set options accordingly if you only want pages in other languages.
Finally, AltaVista's last source of information is the LookSmart directory. Matching categories from the AltaVista-branded version of the directory appear after web page matches, in a section called "AltaVista Recommends."
These suggestions don't appear in response to all search queries. However, the entire AltaVista directory can be browsed from the home page. Simply start from any topic listed in "Categories," on the left-hand side of the home page, below the search box.
Also on the home page, there are two new search services being offered: filtered search and photo search. Links to both can be found just under the search box, in the "Specialty Searches" section.
AltaVista Family Filter is designed to protect children from seeing objectionable material when searching on innocent topics. For example, a search for "toys" may list pages offering sex toys, complete with graphically explicit page titles and descriptions. AltaVista Family Filter aims to prevent this from happening.
"Once you are in filtered mode, you should not see content from objectionable sites," said Monier.
AltaVista is the second major crawler-based search engine to offer this function. Lycos launched its service, SafetyNet, earlier this year. A third crawler-based service, Searchopolis, is currently in beta. It is powered by Inktomi.
AltaVista's filtering works in three ways. First, the spider tags pages as objectionable, if it finds certain words and phrases used in particular ways. Second, it then uses a filtering process developed in partnership with SurfWatch to catch anything that made it past the spidering identification. Finally, AltaVista allows users to report on any pages that may have slipped through the first two barriers.
Additionally, you should be able to still discover pages that contain sexual terms but which are not pornographic in nature, according to Monier. "It can distinguish between a medical page and a smut page," he said.
For example, a search for "breast cancer" should bring up relevant results about the disease and treatment, but not porn pages, when filtering is on.
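The three layers described above can be sketched as a simple pipeline. This is a toy model under stated assumptions, not AltaVista's or SurfWatch's actual logic: the word lists, the blocklist, the example URLs and all function names are hypothetical, and the "medical context" check is a crude stand-in for whatever the real spider does to judge how a word is being used.

```python
# All word lists and URLs below are hypothetical placeholders.
EXPLICIT_TERMS = {"xxx", "porn"}
AMBIGUOUS_TERMS = {"breast", "sex"}
MEDICAL_CONTEXT = {"cancer", "treatment", "diagnosis", "disease"}
THIRD_PARTY_BLOCKLIST = {"http://blocked.example.com/"}
user_reports = set()


def flagged_at_crawl(text):
    """Stage 1: tag a page at crawl time based on how its words are used."""
    words = set(text.lower().split())
    if words & EXPLICIT_TERMS:
        return True
    # An ambiguous term alone is flagged; alongside medical terms it is not.
    return bool(words & AMBIGUOUS_TERMS) and not (words & MEDICAL_CONTEXT)


def report_page(url):
    """Stage 3: record a user report about a page that slipped through."""
    user_reports.add(url)


def is_allowed(url, text):
    """Apply the three stages in order; any one of them can block a page."""
    if flagged_at_crawl(text):
        return False
    if url in THIRD_PARTY_BLOCKLIST:  # Stage 2: separate filtering pass
        return False
    return url not in user_reports


def filter_results(pages):
    """pages is a list of (url, text) pairs; return the URLs that pass."""
    return [url for url, text in pages if is_allowed(url, text)]


pages = [
    ("http://health.example.org/", "breast cancer treatment options"),
    ("http://adult.example.com/", "xxx pictures"),
    ("http://blocked.example.com/", "harmless looking text"),
    ("http://toys.example.com/", "wooden toys for kids"),
]
print(filter_results(pages))
```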
To enable filtering, follow the link from the home page. Choose the "Start" button on the next page that appears. When enabled, a small red + symbol will appear in the page header area, to remind you that filtering is active.
AltaVista's Photo Finder is an impressive new service. Photo search has largely died away as a major offering with the search engines. Until now, only Lycos was still offering a dedicated crawler-based service. Now AltaVista has entered the scene.
"It is the largest index of images on the web. We have over 11.5 million pictures, most of them obtained by crawling the web," said Monier.
Lycos might dispute the title of largest, as it claims to have 18 million images indexed. But where AltaVista wins hands down is usability. Its results contain thumbnail pictures of images it has found, where Lycos displays only text descriptions. Consequently, it is much easier to spot images of interest at AltaVista.
Not everyone is happy with the thumbnail display, however. While it is great for searchers, some photographers and artists are concerned that AltaVista is violating their copyright by making copies of their pictures without explicit permission, in order to present the thumbnails. See the article below for more details on this issue.
Only about half a million of the pictures in the AltaVista service come from its partnership with Corbis. The rest have been identified by AltaVista's spidering system. In particular, the work has been done by the spider hosted at vscooter.av.pa-x.dec.com, which many people have seen in recent months.
AltaVista's image identification system operates essentially like that at Lycos. Images are associated with words that appear in their ALT text, file names and on the page. A search looks for matches in this text, not within the images. Monier added that some more sophisticated things are also being done beyond this basic operation, but he declined to provide more details.
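The basic text-association approach can be sketched in a few lines. This is only a simplified model of the idea described above, not the code either service runs: the record layout (url, alt, page_text), the example addresses and the function names are assumptions, and the "more sophisticated things" Monier alluded to are not represented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(images):
    """Map each word to the images whose ALT text, file name or page uses it.

    `images` is a list of dicts with the hypothetical keys
    "url", "alt" and "page_text".
    """
    index = defaultdict(set)
    for image in images:
        filename = image["url"].rsplit("/", 1)[-1]
        words = (
            tokenize(image["alt"])
            | tokenize(filename)
            | tokenize(image["page_text"])
        )
        for word in words:
            index[word].add(image["url"])
    return index


def search_images(index, query):
    """Return URLs of images associated with every word in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    matches = [index.get(term, set()) for term in terms]
    return set.intersection(*matches)


images = [
    {
        "url": "http://example.com/img/golden_gate.jpg",
        "alt": "Bridge at sunset",
        "page_text": "Photos from San Francisco",
    },
    {
        "url": "http://example.com/img/cat.gif",
        "alt": "",
        "page_text": "My cat sleeping at sunset",
    },
]
index = build_index(images)
print(search_images(index, "golden gate"))
```

Note that the search never looks at the pixels. An image with no ALT text can still be found through its file name or the words around it, which is also why a picture can turn up for a query it has little to do with.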
One sophisticated thing that AltaVista does offer beyond Lycos is the ability to do a true image search independent of text association. Users can ask to see more pictures that are visually similar to any picture selected.
AltaVista has also introduced one other new feature, meant to make life easier for repeat users: a shorter address. The service can now be reached at http://av.com/.
Real Name Tops At AltaVista
The Search Engine Report, June 3, 1998
Kids Search Engines
More services meant to be safe and friendly for children. A link to Searchopolis, mentioned in the article, can be found here.
Searchopolis
A new, Inktomi-powered filtered search engine, currently running in beta.
Lycos Pictures and Sounds
The Lycos multimedia search service. It features 40,000 images organized by category, from the PicturesNow catalog. You can browse categories and view thumbnails of these pictures. Search mode lets you scan the web for pictures or sounds of interest, but no thumbnails are provided.