Over the past month, AltaVista has been busy making some under-the-hood changes to improve its search results. In particular, the service has upgraded its shopping search and is "blending" links that lead to its shopping search engine for appropriate queries into its main results. How searches themselves are processed has also been changed.
Do a search for product-oriented information, such as "dvd players," "inline skates" or "computers," and you'll see the new shopping search links appear. They look similar to AltaVista's regular numbered listings and appear just before these. They usually begin "Compare Prices and Features on..." followed by your search terms, and the words "Shop Smart" also appear next to the link. Selecting the shopping link takes you to the AltaVista shopping search engine, where you can obtain product pricing from over 600 online web merchants.
"We've learned that 20 to 25 percent [of searches” are shopping related searches, but the average user may not know how to get to our shopping vertical," said Ganon Giguiere, senior director of search verticals at AltaVista. By integrating the links to shopping search in the main results, AltaVista hopes to better serve its users.
Naturally, getting people into the shopping search area also benefits AltaVista. Some of the merchants pay AltaVista to be included, but the "vast majority" don't, Giguiere said. Most major shopping search engines do have some type of payment model with merchants, and as long as there's a wide variety of major, reputable companies, I don't think users need to fear these deals.
I'm pleased to see the shopping links integrated as they are, and I'll be looking forward to watching how AltaVista integrates other links to new verticals that it plans to launch in the near future. For the most part, users would benefit by finding better specialty search tools. It's simply been a big challenge to figure out how best to direct them, as I've covered more in the separate "Being Search Boxed To Death" article (see below in the newsletter).
For the most part, I'd rather one or two targeted vertical links replace all the "dumb" links that appear in response to any query. For instance, search for anything on AltaVista, and you'll always get "Extend Your Search" links at the bottom of the page that include whatever you searched for, even if it is absurd that you would do this -- a search for "korean war" makes this area suggest things like "Shop the web for korean war" and "Searching for korean war? Find it at Casino-On-Net.com." Nor is AltaVista alone in having these dumb links. Removing them would eliminate clutter from the result page and ought to increase usability.
AltaVista's shopping search service lets you compare prices, but I'd also like to see product information added. It would be great to see the service expanded to pick up product reviews and consumer information from key sites, which would help it be more than an online mall.
AltaVista gathers the pricing information for its shopping service through a system it calls "scraping," which means its shopping team evaluates a web site, understands how products are listed and prices displayed, then configures its spiders to pull back the information for use in the service. It can also take product feeds from merchants, especially those with which it has relationships.
If you'd like to be one of the included merchants, it's essential that your site first offer quality products, look reputable, BE reputable and offer features such as a return policy. If you meet the prequalifications, AltaVista will then negotiate a deal.
"We only charge for qualified traffic. We apply a conversion rate and get into a cost per customer acquired model that works for them," said Giguiere.
If you like the shopping search service, it also remains accessible from its own home page. The service can be customized to turn off product pictures, increase the number of results and place the "Shop Near You" section first on the page.
Shop Near You is completely new. This is information that comes from "brick and mortar" stores near your home. You give AltaVista your zip code, and it pulls back matches powered by information from SalesHound.com. This can be especially helpful if you prefer to use AltaVista Shopping to do online research, then find a local vendor with a "real" store that you can visit. In addition, auction listings from uBid for products can also be found.
I noticed some bugs with the shopping link integration. Try a search for "pc computers," for example, and rather than a generic "Compare Prices" style link, you'll get a link to a specific product. I don't think the user experience is as good in these cases. "Cordless mouse" was another example, but fortunately, I didn't see many other instances.
The next vertical search product to appear will be news. Giguiere said it will contain stories from many leading news providers. Finance, real estate and travel verticals are also planned. In all these cases, links to the appropriate vertical service will be integrated into the main results page as with shopping, appearing in response to appropriate queries.
Don't forget that AltaVista already has a variety of vertical search tools that you can reach by using the "tabs" above the search box on the results page. Within the AltaVista Tools area, you can also access specialty search features that let you search against US educational sites or US government sites.
Beyond the vertical links, AltaVista has upgraded how it processes search requests, being more expansive with what it returns. Here's a look at the new "query reformulation" changes.
Single word queries are easy -- AltaVista will return any pages it knows of that contain that particular word.
For multiword queries of two to four terms, AltaVista continues to perform automatic phrase detection. This means that it checks your search terms against a dictionary of about 500,000 recognized phrases that it maintains. These have been culled by analyzing actual web pages in the AltaVista index. If your search terms appear in the phrase dictionary, then AltaVista automatically translates your request into a phrase search. For example, if you search for:
the request will be turned into a phrase search behind the scenes, even though you didn't specifically request this by placing quotation marks around the terms.
AltaVista has done automatic phrase searching like this since November 1998. However, in a twist that began in February, it now goes beyond providing only exact phrase matches and will also return pages that contain all of the words in your query, even if they aren't contained in an exact phrase. For instance, consider a search for
new york stock exchange
AltaVista will look for pages that match the exact phrase, but then it also finds pages that have all four words on them, even if they don't occur in that exact order.
Before this change, if you had searched for an exact phrase and there were no pages or few pages with that exact phrase, you would have come up with no results or only a few matches. With this change, AltaVista better ensures that its automatic phrase detection, which is helpful in many cases, doesn't leave users without results in some situations.
"The advantage of not removing documents without the phrase is that sometimes the exact phrase isn't exactly how the page describes the concept. For example, the user might be looking for information on Clinton's cybersecurity proposal. The user enters: 'president bill clinton cybersecurity,' said Vaughn Rhodes, senior director of product marketing for AltaVista. "None of the documents discussing the proposal use that specific phrase [so would be missed the old way”. The new AltaVista technique correctly returns those documents."
Finally, in situations where there are five or more words, AltaVista continues to do phrase detection, and then it will seek pages that have ANY of the words on them, rather than ALL of them.
"When users enter a small number of query concepts [such as two to four words”, they are usually looking for documents that have all of the terms in them. However, when large numbers of concepts are used [five or more”, users tend to be in a 'find stuff like what I describe here' mode, in which they don't necessarily require that every term they enter is present," Rhodes said.
This is all pretty tricky stuff. Most of the other major search engines operate simply on an ANY or ALL basis. They either find ANY of the words you request (also called OR processing) or they find ALL of them (also called AND processing). They don't try to detect phrases and alter their ANY or ALL behavior based on word count.
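To make the word-count rules above concrete, here is a minimal sketch in Python. It is only an illustration of the behavior as described in this article, not AltaVista's actual code. The function names, the tiny phrase dictionary and the way phrases are detected (any run of two to four adjacent terms found in the dictionary) are all assumptions made for the example.

```python
def detect_phrases(terms, phrase_dictionary, max_len=4):
    """Return dictionary phrases found as runs of adjacent terms, longest first.

    Assumption: a phrase is any run of 2..max_len adjacent query terms that
    appears in the dictionary. The real detection method is not documented.
    """
    found = []
    for size in range(min(max_len, len(terms)), 1, -1):
        for start in range(len(terms) - size + 1):
            candidate = " ".join(terms[start:start + size])
            if candidate in phrase_dictionary:
                found.append(candidate)
    return found


def plan_query(query, phrase_dictionary):
    """Decide how a query would be processed, based only on its word count."""
    terms = query.lower().split()
    if len(terms) <= 1:
        # Single word: return any page containing that word.
        return {"terms": terms, "phrases": [], "match": "ALL"}
    phrases = detect_phrases(terms, phrase_dictionary)
    # Two to four terms: pages must contain ALL terms.
    # Five or more terms: pages may contain ANY of the terms.
    match = "ALL" if len(terms) <= 4 else "ANY"
    return {"terms": terms, "phrases": phrases, "match": match}


# Hypothetical stand-in for the roughly 500,000-entry phrase dictionary.
PHRASES = {"new york stock exchange", "new york", "bill clinton"}

short_plan = plan_query("new york stock exchange", PHRASES)
long_plan = plan_query("president bill clinton cybersecurity proposal", PHRASES)
```

In this sketch, the four-word query is planned as an ALL search with its detected phrases noted for ranking purposes, while the five-word query drops to an ANY search.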
So far, I've covered changes to what AltaVista retrieves. However, it also is looking at the search request differently to help rank the pages it recovers. Remember how phrase detection in the past meant that only a subset of pages with those exact phrases would be retrieved? Now, even pages without the exact phrases will be listed. Webmasters, this means that your pages have a chance of appearing in response to some queries at AltaVista when they wouldn't before, because you didn't use an exact phrase on your page that someone had searched for (either knowingly or because AltaVista automatically performed phrase detection).
However, the ranking algorithm has been tweaked to help ensure that those pages with exact phrases do rise to the top of the results. It's also true generally that after this, pages with ALL of the search terms will be listed, then pages with ANY of the terms. Nevertheless, this general system still can be influenced by AltaVista's other ranking factors.
"It is not entirely accurate that we first list pages with matching phrases, then with all search terms, then with any matches. This ranking scheme is true in a general sense, but the ranking algorithm actually uses a number of different factors to determine rank. If a document with a partial match has a higher connectivity [link analysis/popularity” or page quality measurement than one that matches the exact phrase, it can be ranked higher," Rhodes said.
AltaVista clearly hopes the changes will mean better results for average web users, and most people probably will be fine letting AltaVista do the driving, so to speak. However, advanced searchers may still prefer to control AltaVista themselves by using power commands such as the + symbol or Boolean operators. AltaVista says that if such commands are used, they'll take precedence over the internal logic it tries to follow. That hasn't been the case in the past, and I haven't had a chance to look closely to see if this is indeed happening. In addition, AltaVista's power and advanced search pages have none of the behind-the-scenes processing happening.
"The query document selection technique changes apply only to the main search box, not to Advanced or Power Search. Users of those interfaces won't see any changes. In addition, if users enter into the main search box the syntax elements that were used in the previous technique [such as +, -, or quotes”, we will automatically fall back to the previous methodology," Rhodes said.
By the way, AltaVista's "word count" area at the bottom of the page used to be a good way to determine exactly how AltaVista processed your query. No longer. "The 'word count' area at the bottom of the page shows in effect how the documents are being ranked, but does not necessarily show what words were used in selection," Rhodes said. The "AltaVista's Automatic Phrase Searching" article below explains more about how word count used to operate.
Related to search processing, another relatively new feature is the ability to search within results at AltaVista. After performing a search, just check the "Search Within Results" box under the search box on the results page. Then you can do a new search just against the first set of results retrieved.
Also, be aware that if you get to the bottom of the page and want to see more results, you might accidentally select the "More Sponsored Listings" option, if you aren't careful. That brings up more advertisements from GoTo. Instead, if you want more editorial listings, you need to select any of the "Results Pages" numbers or the "Next" link. These appear near the bottom of the results page and just above the "Sponsored Listings" heading. Webmasters, this also means that even if you aren't in the top three from GoTo, you might find some traffic from AltaVista via your paid links, due to accidental or even intentional clicks in the sponsored area.
Leaving behind GoTo links, here's an update on some LookSmart-AltaVista integration:
Some webmasters noticed that for a time, AltaVista was using LookSmart descriptions for their sites. AltaVista says that its contract with LookSmart no longer allows it to do this. Inktomi does continue to use LookSmart descriptions.
Also, I reported in the past that you might find sites ordered differently in AltaVista's LookSmart-powered directory than at LookSmart itself, when comparing the same categories at both places. For example, look at this at AltaVista:
versus the same category at LookSmart:
AltaVista says the order is mainly related to when sites are added to LookSmart -- presumably, newer sites are coming first. However, this is going to change.
"LookSmart now supplies in their feed to us a sorting mechanism which can allow us to sort the categories and web sites according to editor-determined order of relevance. We anticipate implementing this feature in the near future," said Josh Trapp, analyst relations specialist with AltaVista.
By the way, for all those who have asked, AltaVista says it refreshes its listings from LookSmart each day.
In yet more under-the-hood changes at AltaVista, the company says that it now has in excess of 500 million web pages indexed, up from its previous 350 million and in line with the full-text indexes of other major crawlers such as Inktomi, Google and FAST. AltaVista generally hopes to revisit these pages at least once per month, if not sooner.
"Our goal for the new process, which we expect to be done by end of month, is to initially revisit every page at least monthly, with an intelligent scheme to revisit top pages or frequently changing pages much more often. However, we expect to quickly improve on that 30 day time frame. The real goal is not really stated as how often we'll revisit every page on the Internet, but probably more accurately stated as having an index that is as close as possible to what is actually on the Internet right now. Some pages are known to never change, so we'd really not need to revisit them. Other pages change daily, so having a two week schedule for those pages would be inadequate," Rhodes said.
In terms of how AltaVista will add brand new pages, things are being revised. For years, any page directly submitted to the service would tend to appear within the index in a day or two, assuming the page wasn't flagged as spam and that too many submissions from the same site weren't received in the same day. However, for the past few months, this historical dependability has gone away.
"Normally, we have added URL submissions weekly. Recently, we have incurred a backlog. We expect to deal with the URL submissions we have on hand in the near future and are in the process of putting in place a more streamlined submission system," said Trapp.
What the new system is remains to be determined. AltaVista has long hinted that paid inclusion is in the works, so don't be surprised to see this emerge. In the meantime, AltaVista says it remains fine to submit around 5 pages per day per web site via its Add URL page, and with luck, you should see these pages appear within a week, if the service goes back to its most recent schedule. Don't forget, the search engine will still gather pages from your site as a consequence of its own crawling, independent of the submission queue.
Access to the AltaVista education and government search engines can be found here, on the left side of the page. The Power Search page is also listed with them.
AltaVista Shopping Advertising Form
Use this form to have AltaVista consider your site for its shopping area. The form assumes that you want to establish a paid relationship. AltaVista also says that sites are considered for inclusion simply on an editorial basis but has yet to explain how to submit your site for consideration if you do not wish to pay. I'll let you know if I get more details on this.
Being Search Boxed To Death
The Search Engine Report, March 5, 2001
Overview of how search blending such as that being done at AltaVista may improve the search experience for users.
AltaVista's Automatic Phrase Searching
The Search Engine Report, February 4, 1999
Describes how the word count feature used to work.