THE SEARCH ENGINE REPORT
November 4, 1998 - Number 24
About The Report
The Search Engine Report is a monthly newsletter that covers developments with search engines and changes to the Search Engine Watch web site, http://searchenginewatch.com/. This month's issue is sponsored by First Place Software and eGroups.com.
The report has 43,000 subscribers. You may pass this newsletter on to others, as long as it is sent in its entirety.
If you enjoy this newsletter, consider showing your support by becoming a subscriber of the Search Engine Watch web site. It doesn't cost much and provides you with some extra benefits. Details can be found at: http://searchenginewatch.com/about/subscribe.html
Please note that long URLs may break into two lines in some mail readers. Cut and paste, should this occur.
In This Issue
+ General Notes
+ AltaVista Debuts New Search Features
+ AltaVista Photo Finder Has Artists Concerned
+ Ask Jeeves: Asking Questions To Give You Answers
+ Lycos Buys Wired, Gets Facelift
+ Direct Hit Begins Netscape Trial, Has New Partners
+ GlobalBrain To Offer Profile Searching
+ AltaVista Northern Europe Looks Gone For Good
+ Search Engine Notes
+ Subscribing/Unsubscribing Info
It's been a busy month, especially with the huge changes at AltaVista. Consequently, many site updates I had meant to do have been pushed back. I expect to be very busy this month adding links and integrating material from past newsletters into the site. Changes will be noted on the What's New page as they happen. And, if you've sent me submissions recently -- relax, I did get them, and now I'm actually looking through them in more detail.
There are updates already within the site, of course. All the Search Engine EKGs now have been updated with data through September 1998. The SpiderSpotting Chart has also been updated.
New results from the latest feedback test can be found on the Search Engine Response Time page. Northern Light was the big winner, with a feedback response of one hour. Several others showed fast responses, also.
Search voyeurs will be pleased to know that the MetaCrawler Top Search Terms page has been updated with September 1998 information.
All the pages above can be found from the What's New page.
When a prospect does a search for a keyword related to your products or services, do you appear in the top 10 or does your competition? Submitting alone does nothing to ensure good visibility. WebPosition is the first software product to monitor and analyze your search positions in the top search engines. WebPosition has been compared to similar "services" on the web and has been overwhelmingly voted the best and most accurate tool for search position management.
With WebPosition you'll know your exact positions for an unlimited number of keywords. You'll know if you drop in rank. You'll know when a search engine FINALLY indexes you. You'll know when you've been dropped from an engine. WebPosition is rated 5 out of 5 stars by ZD Net, and includes a 110 page guide to improving your search positions.
Try WebPosition yourself for FREE at:
Search Engine News
AltaVista introduced a number of significant new search features in October, including the display of relevant results from the Ask Jeeves answer service, a new filtering option and a photo search service.
Those performing a standard web search at the service may now receive information from up to four different sources on the search results page, a combination that AltaVista has dubbed "Full View Searching." The hope is that more sources mean better odds that a searcher will come away with relevant results.
Most notable are the new "Ask AltaVista" results, a branded version of the Ask Jeeves service. These results appear in response to many queries, under the heading of "AltaVista knows the answers to these questions."
These Ask Jeeves results give AltaVista a real edge in terms of relevancy over its competitors, because the results are often exactly on target in response to a query.
A separate article below describes the Ask Jeeves system in more detail. In short, editors have created 7 million questions, which are linked to web pages that provide answers.
For instance, when you search for "software," Ask AltaVista responds with questions such as "Where can I find product reviews for software?" Clicking on the answer button takes you to a site with reviews.
Doing the same search at Ask Jeeves will reveal many more answers. AltaVista is saving screen real estate by listing only the first answer, though in some cases, it will display a "More Answers" link. A better way of getting more suggestions is to phrase your search as a question.
For example, "Where can I buy software?" causes AltaVista to display several questions, such as "Where can I find good deals on new software" and "How can I buy wholesale computers, peripherals & software?"
The second source of search results is from RealNames, through a partnership AltaVista established earlier this year. RealNames is an alternative web site address system, especially meant to deliver people directly to the correct site when doing brand-oriented searches such as "nike" or "barnes & noble." A past Search Engine Report article listed below describes the system in more detail.
Matches from the AltaVista web index make up the third source of information being offered to searchers. Here, the service has introduced a number of changes meant to improve the relevancy of its core offering.
First, AltaVista says there has been a relevancy change, especially designed to improve results for simple and popular terms.
"We have totally revamped our ranking algorithm," said Louis Monier, AltaVista's chief technical officer. "Not only for technical questions, but also the simple questions such as cars, real estate or wine."
Monier was closemouthed about what specifically has been done to improve results, but the changes do not seem to include non-textual criteria, such as link popularity or boosts for reviewed sites, which have been remarkably effective at Infoseek.
In fact, I ran a quick comparison of the examples Monier cited against Infoseek's results. I felt Infoseek's results for these generic terms were impressive and far better than AltaVista's.
Other behind the scenes changes are also at work, with one of the most important being automatic phrase searching.
Many people would benefit if they did phrase searching. For example, a search for "new york" -- with quote marks around the words to indicate an exact phrase should be matched -- is likely to provide better answers than a search that simply looks to see if any or all of the words appear on a page.
Since most people don't bother with using special commands like the quote marks, AltaVista has made it easier for many by making phrase searching automatic.
"We have built a huge dictionary of several million phrases," said Monier. "Essentially, the need for quotes has gone down to zero. If you look for New York, you don't need to quote New York."
Monier said the phrase detection has been built through a completely automated process, which takes advantage of linguistic patterns to determine what is a phrase, and what isn't.
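As a rough illustration of how a phrase dictionary can be applied to a query, the sketch below greedily matches the longest known phrase at each position. The tiny PHRASES set and the detect_phrases function are invented for this example; AltaVista has not said how its dictionary is built or applied.

```python
# Hypothetical phrase dictionary; a real one would hold millions of entries.
PHRASES = {
    ("new", "york"),
    ("real", "estate"),
    ("new", "york", "times"),
}
MAX_PHRASE_LEN = max(len(p) for p in PHRASES)


def detect_phrases(query):
    """Split a query into units, grouping words that form a known phrase."""
    words = query.lower().split()
    units = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so "new york times" beats "new york".
        for size in range(min(MAX_PHRASE_LEN, len(words) - i), 1, -1):
            candidate = tuple(words[i:i + size])
            if candidate in PHRASES:
                units.append(" ".join(candidate))
                i += size
                break
        else:
            # No known phrase starts here; keep the single word.
            units.append(words[i])
            i += 1
    return units


print(detect_phrases("New York hotels"))
print(detect_phrases("new york times real estate"))
```

Each unit returned can then be searched as if the user had typed the quote marks, which is the effect Monier describes.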
AltaVista has also introduced spell checking, a first for a major search service, and something that is long overdue. Yahoo has been experimenting with it off and on for several months, but at AltaVista, it's now a full-time option.
"We estimate one in five queries is misspelled," Monier said, such as "virtual" spelled as "vurtual." In other cases, people fail to separate words, such as "networkcomputer" instead of "network computer."
AltaVista won't change a misspelled query. Instead, it offers a prompt on the search results pages with other suggestions. When triggered, they appear just below the search box with the words "Spell check." Spell checking works for queries in English, French, Spanish and Italian.
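The two kinds of mistakes Monier mentions, near-miss spellings and run-together words, can be sketched with Python's standard difflib module. The VOCABULARY list, the suggest function and the 0.8 similarity cutoff are assumptions made for this example, not details of AltaVista's system. Like AltaVista, the sketch only offers suggestions and leaves the query itself alone.

```python
import difflib

# Hypothetical vocabulary; a real service would draw on its index or query logs.
VOCABULARY = ["virtual", "network", "computer", "software"]


def suggest(query):
    """Return suggested rewrites for a query, or [] if every word is known."""
    suggestions = []
    for word in query.lower().split():
        if word in VOCABULARY:
            continue
        # Case 1: run-together words such as "networkcomputer".
        for cut in range(1, len(word)):
            left, right = word[:cut], word[cut:]
            if left in VOCABULARY and right in VOCABULARY:
                suggestions.append(left + " " + right)
        # Case 2: near misses such as "vurtual".
        suggestions.extend(
            difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.8)
        )
    return suggestions


print(suggest("vurtual"))
print(suggest("networkcomputer"))
```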
One change AltaVista could really use is results clustering. Often, the top results seem to be dominated by pages from one site. Do a search for "sports scores," for example, and six out of the top 10 results are from CBS SportsLine. SportsLine operates various domains, and AltaVista is listing the same page under slightly different addresses. Clustering could help improve the chances that one site wouldn't crowd out others like this.
Finally, AltaVista's last source of information is the LookSmart directory. Matching categories from the AltaVista-branded version of the directory appear after web page matches, in a section called "AltaVista Recommends."
These suggestions don't appear in response to all search queries. However, the entire AltaVista directory can be browsed from the home page. Simply start from any topic listed in "Categories," on the left-hand side of the home page, below the search box.
Also on the home page, there are two new search services being offered: filtered search and photo search. Links to both can be found just under the search box, in the "Specialty Searches" section.
AltaVista Family Filter is designed to protect children from seeing objectionable material when searching on innocent topics. For example, a search for "toys" may list pages offering sex toys, complete with graphically explicit page titles and descriptions. AltaVista Family Filter aims to prevent this from happening.
"Once you are in filtered mode, you should not see content from objectionable sites," said Monier.
AltaVista is the second major crawler-based search engine to offer this function. Lycos launched its service, SafetyNet, earlier this year. A third crawler-based service, Searchopolis, is currently in beta. It is powered by Inktomi.
AltaVista's filtering works in three ways. First, the spider tags pages as objectionable, if it finds certain words and phrases used in particular ways. Second, it then uses a filtering process developed in partnership with SurfWatch to catch anything that made it past the spidering identification. Finally, AltaVista allows users to report on any pages that may have slipped through the first two barriers.
Additionally, you should be able to still discover pages that contain sexual terms but which are not pornographic in nature, according to Monier. "It can distinguish between a medical page and a smut page," he said.
For example, a search for "breast cancer" should bring up relevant results about the disease and treatment, but not porn pages, when filtering is on.
To enable filtering, follow the link from the home page. Choose the "Start" button on the next page that appears. When enabled, a small red + symbol will appear in the page header area, to remind you that filtering is active.
AltaVista's Photo Finder is an impressive new service. Photo search has largely died away as a major offering among the search engines. Until now, only Lycos was still offering a dedicated crawler-based service. Now AltaVista has entered the scene.
"It is the largest index of images on the web. We have over 11.5 million pictures, most of them obtained by crawling the web," said Monier.
Lycos might dispute the title of largest, as it claims to have 18 million images indexed. But where AltaVista wins hands down is usability. Its results contain thumbnail pictures of images it has found, where Lycos displays only text descriptions. Consequently, it is much easier to spot images of interest at AltaVista.
Not everyone is happy with the thumbnail display, however. While it is great for searchers, some photographers and artists are concerned that AltaVista is violating their copyright by making copies of their pictures without explicit permission, in order to present the thumbnails. See the article below for more details on this issue.
Only about half a million of the pictures in the AltaVista service come from its partnership with Corbis. The rest have been identified by AltaVista's spidering system. In particular, the spider hosted at vscooter.av.pa-x.dec.com, which many people have seen in recent months, has been doing the work.
AltaVista's image identification system operates essentially like that at Lycos. Images are associated with words that appear in their ALT text, file names and on the page. A search looks for matches in this text, not within the images. Monier added that some more sophisticated things are also being done beyond this basic operation, but he declined to provide more details.
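To make the basic idea concrete, here is a minimal sketch of indexing images by the words in their ALT text and file names, using Python's standard HTML parser. The class and function names are invented for the example, and the sketch leaves out the surrounding page text and whatever "more sophisticated things" AltaVista does.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect (src, alt) pairs for every <img> tag on a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            if attributes.get("src"):
                self.images.append((attributes["src"], attributes.get("alt") or ""))


def index_images(html):
    """Map each word found in ALT text or file names to matching image URLs."""
    collector = ImageCollector()
    collector.feed(html)
    index = defaultdict(set)
    for src, alt in collector.images:
        # "/pics/golden-gate.jpg" -> "golden-gate"
        filename = src.rsplit("/", 1)[-1].rsplit(".", 1)[0]
        for word in re.findall(r"[a-z0-9]+", (alt + " " + filename).lower()):
            index[word].add(src)
    return index


page = '<p>Our trip</p><img src="/pics/golden-gate.jpg" alt="Bridge at sunset">'
index = index_images(page)
print(sorted(index["bridge"]))
print(sorted(index["golden"]))
```

A search for "bridge" or "golden" finds the picture purely through its associated text; nothing in the image itself is examined.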
One sophisticated thing that AltaVista does offer beyond Lycos is the ability to do a true image search independent of text association. Users can ask to see more pictures that are visually similar to any picture selected.
AltaVista has also introduced one other new feature, meant to make life easier for repeat users: a shorter address. The service can now be reached at http://av.com/.
Real Name Tops At AltaVista
The Search Engine Report, June 3, 1998
Kids Search Engines
More services meant to be safe and friendly for children. A link to Searchopolis, mentioned in the article, can be found here.
A new, Inktomi-powered filtered search engine, currently running in beta.
Lycos Pictures and Sounds
The Lycos multimedia search service. It features 40,000 images organized by category, from the PicturesNow catalog. You can browse categories and view thumbnails of these pictures. Search mode lets you scan the web for pictures or sounds of interest, but no thumbnails are provided.
The introduction of AltaVista's new Photo Finder service has raised concerns with some photographers and artists that AltaVista is violating their copyright by making copies of their pictures without explicit permission.
Similar concerns have been raised in the past by webmasters concerned that search engines are indexing their textual content without permission. So far, there's never been a legal suit against any of the search engines regarding this. For the most part, webmasters want traffic, so they don't complain about indexing. Additionally, search engines generally say that they are not presenting entire documents, only summaries, and so they don't fall afoul of copyright violations.
Finally, the robots.txt file and the meta robots tag exist as a way for webmasters to explicitly exclude their content. They work as a de facto automatic copyright notice for spiders. My suspicion is that if there ever was a court case, use of these mechanisms as a "tell us no, otherwise it is OK" permission device would be supported. If not, crawler-based services might find their systems shut down, which would have a negative impact on the entire Internet community.
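For those unfamiliar with the mechanism, the sketch below shows a small robots.txt file and how a well-behaved spider would consult it, using Python's standard urllib.robotparser module. "ExampleImageBot" is a made-up spider name used only for illustration; it is not the name any real crawler announces.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt a site might publish. "ExampleImageBot" is hypothetical.
ROBOTS_TXT = """\
User-agent: ExampleImageBot
Disallow: /photos/

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(agent, path):
    """Return True if the published rules allow this agent to fetch the path."""
    return parser.can_fetch(agent, path)


print(may_fetch("ExampleImageBot", "/photos/sunset.jpg"))
print(may_fetch("ExampleImageBot", "/index.html"))
print(may_fetch("OtherBot", "/photos/sunset.jpg"))
```

The default is permission: anything not explicitly disallowed is treated as fair game, which is exactly the "tell us no, otherwise it is OK" arrangement.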
The situation is different in the image search case. Search engines give just a summary of text documents, but AltaVista is actually displaying complete copies of pictures in its image search results, albeit in low-quality thumbnail form.
This leads to a chief concern: that the catalog gives the impression that images displayed may be used freely.
"In all honesty, this is my main complaint about Photo Finder: they've made it too easy for folks to steal people's web graphics, and there's already too much of that. In fact, by so freely reproducing the works of others, they appear to endorse this behavior as a company, in effect, leaving their visitors with the impression, 'it's OK to put these on your page, too,'" said photographer Reid Stott, who has posted a page on the issue for fellow artists.
AltaVista's Chief Technical Officer Louis Monier said the service is now working to ease concerns.
"As you can imagine, we are not trying to steal anything or encourage piracy. This is a first, so we were expecting some reactions and are taking action to make the Photo Finder useful for everyone. We are adding the proper disclaimers, and instructions for opt-in/opt-out," he said.
AltaVista has already added a notice to the Photo Finder home page, telling searchers they should seek permission to use the images displayed. Stott and others think this should be more strongly worded, and that it should also appear on results pages, in association with the thumbnail images themselves.
AltaVista has also made another change, in the wake of complaints. Previously, users could click on a thumbnail and download the image without ever visiting the host web site. Now, this is no longer possible.
However, if a user goes through the extra step of choosing "About This Image," then clicking on the "Image" link, they can still download an image without visiting the host site. A similar situation occurs if they view results in "verbose" mode. It is a simple step to plug this last hole, so I wouldn't be surprised to see it changed in the future.
Last, but not least, the fact that the images are being used without explicit permission upsets Stott and other artists.
"I believe it could be both a useful and non-controversial resource. If they'd placed the service online with all those Corbis images, and then allowed people to 'opt-in' to the service through a 'submit site' link, you'd never have heard a peep from me," Stott said. "This would have honored the spirit of copyright law by, in effect, asking permission for the use of copyrighted works in advance, by allowing the content creator to make a choice about the use of their work. It may seem a fine distinction, but it goes to the heart of copyright law: you must obtain permission, in advance."
The big problem with this is that many people would fail to explicitly submit their images, just as web authors already fail to properly submit their web sites or make use of meta tags. That would reduce the effectiveness of the current catalog, which, ironically, is an ideal way for artists to discover if their images are being used without permission on the web.
Of course, the real benefit is to web searchers, and it is reasonable to assume that those web searchers are looking for images to use in their sites or elsewhere. This leads back to the issue that the service may encourage piracy.
For artists that wish to protect their images, the robots.txt file option is available, as are two new meta robots tag options that AltaVista has introduced specifically for those with image indexing concerns. These are useful for those unable to use a robots.txt file. Both are described on a special page AltaVista has created to help webmasters understand how to opt out of the service.
Unfortunately, there may be a delay between the time the blocks are installed and when AltaVista actually drops the images. Here, the service made another stumble. It has run its image spider over the past few months without publicly identifying what the spider was for. That meant artists had no idea that this was a spider they may have wanted to block, and so they have been included by default.
One unlikely change is removal of the thumbnails. AltaVista considers them acceptable use that doesn't require special permission, just as a summary of a web page might be considered acceptable use.
"The thumbnails are not the issue (they are an acceptable 'citation'). It's the fact that we linked to the picture with no context that displeased a few people: we now give access to the page, so users can see the entire context. And, of course, we respect the robots.txt file and any request to have images removed," Monier said, via email.
Whether a court case results remains to be seen. The onus is left to artists to block the AltaVista spider, just as with web page authors. The difference is that complete works are being presented, and that the artists receive far less gain from this than web authors do. Thus, they may have more desire to pursue an action.
AltaVista Photo Finder
AltaVista Photo Finder, and how to keep your images "unfound"
Reid Stott's summary of how to keep your images out of the AltaVista Photo Finder service.
AltaVista Help: Excluding Pages From Photo Finder
Instructions from AltaVista on excluding your pages from Photo Finder with a robots.txt file or the meta robots tag.
Compaq Accused Of Copyright Infringement
Newsbytes, Oct. 27, 1998
At least one artist has sent a formal complaint to AltaVista regarding Photo Finder.
News Robot Leads To Linking, Indexing Dispute
The Search Engine Report, Jan. 9, 1998
Discusses how the robots.txt file may be crucial in any indexing dispute that eventually winds up in court.
One of the biggest problems that search services face is the fact that people often search too broadly. For example, they enter something like "travel" and then expect relevant results.
Travel what? Travel agents? Places to book airline tickets? Travel guides? If a search engine could talk, it would ask these questions in order to understand exactly what a person is seeking.
Enter Ask Jeeves. The service does an impressive job of getting people to what they want by asking questions.
For example, imagine you want information about cars. Enter "cars" into the Ask Jeeves search box, and the service comes back with questions like:
+ Where can I find product reviews for cars?
+ Which models of cars are most frequently stolen?
+ Where can I locate information on the history of automobiles?
In front of each question is a Go icon. Choosing it takes users to a web site that answers the question.
The secret to the accuracy of Ask Jeeves is human intervention. About 30 people work full time creating the knowledge base of questions, which currently numbers about 7 million. They come up with ideas on their own, especially for popular topics, and they also watch what people are actually searching for.
"One of the really wonderful things about our site is a significant portion of what people input is indeed questions," said David Warthen, cofounder of Ask Jeeves and its chief technical officer. "We've got a very active feedback loop going."
Technology also plays an important role in helping Ask Jeeves provide relevant matches. Do a search for "white house," and notice the questions that come up. Among those that provide White House information is "Where can I find special reports on the news topic Monica Lewinsky?"
The words "white house" aren't mentioned in that question, yet many people might find it relevant for a search on "white house." How is it happening?
At Ask Jeeves, questions are associated with concepts, and concepts are in turn related to each other. So, questions about the White House may be linked to words like "monica lewinsky." Likewise, a search for "kenneth starr" might be associated with "monica lewinsky" and "bill clinton."
These relationships cause unexpected yet often relevant questions to appear. A further twist is that relationships between concepts can grow or diminish depending on what users select.
For example, if many people searching for "white house" choose the Monica Lewinsky question, then that relationship is strengthened. More questions relating to her and the current political scandal may then appear for "white house" searches in the future.
In contrast, a year from now people may be looking for other information when searching for "white house." The relationship with Monica Lewinsky would weaken (no pun intended), and fewer questions relating to her would be presented.
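The strengthening and weakening described above can be pictured as a set of weighted links that grow with clicks and fade over time. The sketch below is only that, a picture: the QuestionLinks class, the boost and decay values, and the second sample question are all invented, and Ask Jeeves has not published how its system actually works.

```python
class QuestionLinks:
    """Weighted links between search terms and questions (illustrative only)."""

    def __init__(self):
        self.strength = {}

    def link(self, term, question, strength=1.0):
        self.strength[(term, question)] = strength

    def record_click(self, term, question, boost=0.5):
        # A user chose this question for this term, so strengthen the link.
        key = (term, question)
        self.strength[key] = self.strength.get(key, 0.0) + boost

    def decay(self, rate=0.5):
        # Called periodically so links nobody selects gradually weaken.
        for key in self.strength:
            self.strength[key] *= rate

    def top_questions(self, term, limit=3):
        matches = [(s, q) for (t, q), s in self.strength.items() if t == term]
        matches.sort(key=lambda pair: (-pair[0], pair[1]))
        return [q for _, q in matches[:limit]]


LEWINSKY = "Where can I find special reports on the news topic Monica Lewinsky?"
TOURS = "Where can I find information about White House tours?"  # made-up question

links = QuestionLinks()
links.link("white house", LEWINSKY)
links.link("white house", TOURS)

links.record_click("white house", LEWINSKY)
now = links.top_questions("white house")

# Later on: interest fades, and searchers start choosing the other question.
links.decay()
links.record_click("white house", TOURS)
later = links.top_questions("white house")

print(now[0])
print(later[0])
```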
Human editors also review relationships. They can upgrade or downgrade links in response to changing search patterns.
To get the most out of Ask Jeeves, be specific and phrase your search as a question. Don't be afraid to ask it exactly what you want -- you'll probably be surprised to find it has an answer.
Of course, Ask Jeeves doesn't know everything. As a back up, it has a metacrawler component. Top results from various search engines appear below Ask Jeeves's own information, or if Ask Jeeves has no information, then metacrawler results appear at the top of the page.
Another feature users may like about Ask Jeeves is its spell check component. If it suspects a mistake, it will say, "I think you may have misspelled something" above the search results, sometimes accompanied by alternative spellings.
A fun feature on the home page is a real time display of actual questions people are asking.
Ask Jeeves has been online since the spring of last year, and its new partnership with AltaVista (see article above) will expose many more people to the pleasure of having good answers. Those looking for more should also consider making Ask Jeeves a first stop.
eGroups.com makes it easy for anyone to start an email group.
It's flexible and FREE, and a great way to keep in touch with:
*co-workers *family *colleagues *alumni *teams
*people interested in almost any topic *customers
eGroups.com: Join more than one million members in discovering
the power of free email groups at......http://eGroups.com
Lycos acquired both a network of new web properties, including the HotBot search engine, and a new look in October.
Lycos announced an agreement on Oct. 6 to acquire Wired Digital, which owns HotBot, HotWired, Wired News, Webmonkey and Suck, through a stock swap valued at US $83 million.
Devoted followers of HotBot needn't worry about dramatic changes, at least for the near future. Plans are to run HotBot as a separate service, especially as it appeals to a different audience than does the main Lycos service. Lycos says there is only a 20 percent overlap in audiences between HotBot and Lycos.
Also, due to an existing multiyear contract, HotBot will continue to be powered by Inktomi. In short, HotBot should continue to be everything that you've liked.
On the heels of the announcement came an overhaul of the Lycos service. This is the latest in a series of significant redesigns Lycos has undergone this year, so users can be forgiven if they feel a bit lost. To provide some guidance, here's a rundown on elements you may discover on the search results page.
Along the left-hand side of the page is a navigational guide, which is designed to highlight content within the site, such as site guides and message areas. These will often be relevant to a particular search topic.
In the main part of the page, paid "Bullseye" links continue to come first, in response to popular search terms such as "travel." After this, a new "First and Fast" section may appear, where Lycos displays links to information it believes will be appealing. Links include content partner sites, pages within the Lycos service and offsite editorial picks.
"We're trying to categorize popular links and known content right at the top," said Lycos product manager Rajive Mathur.
The Matching Categories section comes next, where relevant Lycos Community Guides are listed. These are Yahoo-like lists of web sites, created using automated technology and user feedback. They are one of the best parts of the Lycos service, and I find the quality of picks is often far better than the Lycos web search results.
There are guides for all sorts of categories, such as budget travel, Windows 98 and gender studies. An important change is that Lycos is now listing fewer categories in its search results overall. Previously, a search for "travel" might bring up 10 or 20 travel-related categories. Now, things are designed so that mostly upper level categories are returned, the service says.
The aim is to avoid overwhelming searchers. Lycos has also added descriptions to the categories, so that people know what to expect if they select a link.
"We want to have less links, but more relevant links," Mathur said.
Remember, you can always browse the community guides, in order to get a better idea of all categories offered. Top level categories are listed on the home page, just below the search box. Simply click your way down to an area of interest.
One glitch in the More Categories section is that Lycos is also listing some content that is not from its Community Guides, such as matching Yellow Pages content. Mathur says Lycos is working to correct this.
Below More Categories comes the Check This Out section, where Lycos content partners are given prominent links. Following this are matching web pages from the Lycos index.
The most significant change in the web search results is that Lycos has now instituted clustering. This means that only one page per web site appears in the top listings. The change leaves only AltaVista, Excite and Excite-owned WebCrawler of the major services not to do clustering by default.
Clustering is good, because it means that more sites have an opportunity to be represented in the top results. That offers searchers more choice. However, a flaw with the Lycos system prevents this from happening.
For example, imagine that a web site has four pages in the top results. With clustering, only one page will be listed -- allowing room for three pages from other web sites to be represented. Unfortunately, this doesn't happen. Instead, the search results simply grow shorter. No new sites move up to occupy the "vacant" spots.
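The difference between the two behaviors is easy to see in a small sketch. Both functions below are illustrations written for this newsletter, not Lycos code, and they treat "site" as simply the host name in the URL. The first trims the list to a page and then removes duplicates, which leaves gaps. The second removes duplicates while walking down the full ranking, so lower-ranked sites move up.

```python
from urllib.parse import urlparse


def host_of(url):
    return urlparse(url).netloc.lower()


def cluster_without_backfill(ranked_urls, page_size=10):
    """Cut to one page first, then drop repeat sites: the page can shrink."""
    seen, page = set(), []
    for url in ranked_urls[:page_size]:
        if host_of(url) not in seen:
            seen.add(host_of(url))
            page.append(url)
    return page


def cluster_with_backfill(ranked_urls, page_size=10):
    """Drop repeat sites while scanning the whole ranking: the page stays full."""
    seen, page = set(), []
    for url in ranked_urls:
        if host_of(url) in seen:
            continue
        seen.add(host_of(url))
        page.append(url)
        if len(page) == page_size:
            break
    return page


ranked = [
    "http://a.example/1",
    "http://a.example/2",
    "http://b.example/",
    "http://a.example/3",
    "http://c.example/",
    "http://d.example/",
]
print(cluster_without_backfill(ranked, page_size=3))
print(cluster_with_backfill(ranked, page_size=3))
```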
Clustering can't be turned off, but the "More Pages" link below each listing allows the presentation of other pages from that web site.
Also relating to web page results, I mentioned last month that the Lycos index seemed dated. It still seems woefully stale. Some new sites submitted nearly three months ago that I know of have yet to appear, and I've received similar complaints from various readers.
Mathur says that the index is constantly being refreshed, and that another catalog update should occur shortly. So we'll all watch and see if things do improve.
Moving over to the home page, users will find "Search Options" and "Kids Safe Search" as choices prominently listed at the top of the page.
Kids Safe Search allows users to enable the Lycos SafetyNet filtering mode, which is intended to allow kid safe searching. A past Search Engine Report article below describes this in more detail.
Search Options takes users to the Lycos Pro service, where advanced queries can be performed. One relatively new feature, added about four months ago, is the ability to search by language. Lycos identifies pages at the time of spidering and categorizes them into one of 15 languages.
HotWired About To Encounter Its Hard-Charging New Master
Internet World, Nov. 2, 1998
In depth look at Lycos CEO Bob Davis. Excellent reading.
Wired Digital Is Bought by Lycos
Internet World, Oct. 12, 1998
Wired Finds New Network Home
Red Herring, Oct. 9, 1998
A look at how the Wired Digital properties fit into plans for Lycos to establish a network of sites, in comparison to how acquisitions have gone with Lycos rivals.
Kid-Friendly Searching From Lycos, Disney, Ask Jeeves
The Search Engine Report, July 1, 1998
Learn how the Lycos SafetyNet filter works.
Lycos Community Guides Get Comeuppance
The Search Engine Report, April 30, 1998
Describes how the Community Guides are created.
In a few months, the backers of GlobalBrain.net hope to bring profile-based searching to the web, through a deal with one of the major search services.
GlobalBrain is technology that uses clickthrough data combined with user data to deliver highly targeted search results. It has been developed by Dr. Grant Ryan and a team of programmers in New Zealand, and is being marketed to search firms by a Bay Area business development consulting company called Double Impact.
"Based on the incredible interest and feedback we've received from test users, Portal CTOs and CEOs and industry analysts, we expect to solidify a major exclusive relationship and release the technology to the public within a few short months," said Double Impact principal and GlobalBrain board member Steven Marder.
At the core of GlobalBrain is a system of measuring what users select from search results. That sounds like Direct Hit, which currently has a partnership with HotBot. However, GlobalBrain is not a copycat system attempting to cash in on Direct Hit's success. I heard about both technologies at nearly the same time, about six months ago, before either launched publicly. It seems very much the case of a good idea -- counting clicks -- being developed independently.
Both systems watch to see which pages users actually select from among the search results. These pages are considered better than others, especially if a person spends much time viewing them. In this way, searchers invisibly vote for the pages they like.
Direct Hit uses this clickthrough data to rerank HotBot's "normal" listings, and it often produces more relevant results. HotBot users tap into Direct Hit by performing a search, then selecting the "Top 10 Most Visited Sites" link that appears above the HotBot results.
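As a rough illustration of click counting, here is a minimal Python sketch. Neither Direct Hit nor GlobalBrain has published its formula, so the scoring rule, the `seconds_weight` value, the function names and the example URLs below are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical click log: (query, url clicked, seconds spent viewing the page).
# Every value here is invented for illustration.
CLICK_LOG = [
    ("beanie babies", "http://example.com/a", 120),
    ("beanie babies", "http://example.com/b", 5),
    ("beanie babies", "http://example.com/a", 90),
    ("beanie babies", "http://example.com/c", 40),
]

def popularity_scores(log, query, seconds_weight=0.01):
    """Score each URL: one point per click, plus a small bonus for time viewed.

    The weighting is an assumption, not either company's actual formula.
    """
    scores = defaultdict(float)
    for logged_query, url, seconds in log:
        if logged_query == query:
            scores[url] += 1.0 + seconds_weight * seconds
    return scores

def rerank(normal_results, log, query):
    """Reorder the engine's normal listing so the most-selected pages come first.

    sorted() is stable, so pages with equal scores keep their original order.
    """
    scores = popularity_scores(log, query)
    return sorted(normal_results, key=lambda url: -scores.get(url, 0.0))

normal = [
    "http://example.com/b",
    "http://example.com/c",
    "http://example.com/a",
    "http://example.com/d",
]
top = rerank(normal, CLICK_LOG, "beanie babies")
```

In this made-up log, page "a" was chosen twice and viewed the longest, so it moves from third place to first. Page "d", which nobody selected, drops to the bottom.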
GlobalBrain steps beyond Direct Hit with its concept of profiles. GlobalBrain asks searchers to voluntarily register themselves by country, age, occupation, favorite sport and gender. This user data is then linked to clickthrough data. This allows GlobalBrain to deliver results targeted to particular profiles, an extremely powerful feature.
For example, a person living in the United States will have a completely different concept of what is relevant for "football" than someone who lives in the United Kingdom. With profiling, GlobalBrain can intelligently deliver results about American football to the US surfer, while the UK surfer gets information about what an American would call soccer.
Likewise, profiling can be done by interest area. At Excite, a search for "giants" brings up special information about the New York Giants football team. That's a great enhancement, unless you are a fan of the San Francisco Giants baseball team. With GlobalBrain, someone who indicates that their favorite sport is baseball, or that they live in San Francisco, would be more likely to get information about the Giants baseball team, rather than the football team.
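The football and Giants examples can be sketched in a few lines of Python. This is only a toy model, assuming that each click is stored alongside one attribute of the searcher's profile. The log format, the `rank_for_profile` name and the example sites are hypothetical and do not describe GlobalBrain's real design.

```python
from collections import Counter

# Hypothetical log: (profile attribute of the searcher, query, url clicked).
CLICKS = [
    (("country", "US"), "football", "nfl.example"),
    (("country", "US"), "football", "nfl.example"),
    (("country", "US"), "football", "soccer.example"),
    (("country", "UK"), "football", "soccer.example"),
    (("country", "UK"), "football", "soccer.example"),
    (("sport", "baseball"), "giants", "sfgiants.example"),
    (("sport", "football"), "giants", "nygiants.example"),
]

def rank_for_profile(query, profile, clicks=CLICKS):
    """Rank URLs by clicks from past searchers who share an attribute with this profile."""
    attributes = set(profile.items())
    counts = Counter(
        url
        for attribute, logged_query, url in clicks
        if logged_query == query and attribute in attributes
    )
    # most_common() lists the most-clicked URL first.
    return [url for url, _ in counts.most_common()]

us_results = rank_for_profile("football", {"country": "US", "sport": "baseball"})
uk_results = rank_for_profile("football", {"country": "UK"})
giants_results = rank_for_profile("giants", {"country": "US", "sport": "baseball"})
```

With this invented data, the US searcher sees the American football site first, the UK searcher sees only the soccer site, and the baseball fan gets the San Francisco Giants page for "giants."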
GlobalBrain also offers an innovative keyword suggester that leverages what people click on to build a thesaurus of related terms.
For example, imagine someone searches for "bicycling" and then selects a page from the results that are displayed. They may then search for "cycling." Because the two keywords were entered one after the other, the system assumes they may be related. The next time someone searches for bicycling, it will then suggest "cycling" as an alternative.
The system also measures which of the suggested keywords users click on. This helps it learn which keywords are best related to each other, in the same way it learns which web pages people find useful.
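Here is a minimal Python sketch of how such a suggester might be built. It assumes the simplest possible rule: a keyword entered immediately after another is linked to it, and each click on a suggestion strengthens that link. The class and method names are invented for this example and are not GlobalBrain's.

```python
from collections import defaultdict

class KeywordSuggester:
    """Toy thesaurus built from back-to-back searches (an illustrative sketch only)."""

    def __init__(self):
        # related[a][b] = strength of the link from keyword a to keyword b
        self.related = defaultdict(lambda: defaultdict(int))

    def record_session(self, queries):
        """Link each query to the one the same user entered right after it."""
        for first, second in zip(queries, queries[1:]):
            if first != second:
                self.related[first][second] += 1

    def record_suggestion_click(self, query, suggestion):
        """A clicked suggestion counts as extra evidence that the two are related."""
        self.related[query][suggestion] += 1

    def suggest(self, query, limit=3):
        """Return related keywords, strongest link first."""
        links = self.related.get(query, {})
        return sorted(links, key=lambda word: -links[word])[:limit]

suggester = KeywordSuggester()
suggester.record_session(["bicycling", "cycling"])
suggester.record_session(["bicycling", "mountain bikes"])
suggester.record_suggestion_click("bicycling", "mountain bikes")
suggestions = suggester.suggest("bicycling")
```

After these three events, "mountain bikes" outranks "cycling" as a suggestion for "bicycling," because one searcher clicked on it when it was offered.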
Currently, Excite is most advanced in this area of suggesting alternative words. It watches for links between words that appear on pages, rather than within searches. For example, if it sees many pages where Apple appears with computer, it may then suggest computer as a word that should be included in the search.
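The page-based approach can be sketched the same way. The Python below simply counts how many pages contain two words together. The sample pages, the `min_pages` threshold and the function names are assumptions for illustration, since Excite has not described its method at this level of detail.

```python
from collections import Counter
from itertools import permutations

# Hypothetical page texts, invented for this example.
PAGES = [
    "apple computer sales rise",
    "new apple computer models",
    "apple pie recipe",
    "computer prices fall",
]

def cooccurrence_counts(pages):
    """For each ordered pair of distinct words, count the pages that contain both."""
    counts = Counter()
    for text in pages:
        words = set(text.lower().split())
        counts.update(permutations(words, 2))
    return counts

def suggest_words(term, pages, min_pages=2):
    """Suggest words that share at least min_pages pages with the term."""
    counts = cooccurrence_counts(pages)
    hits = [
        (other, n)
        for (word, other), n in counts.items()
        if word == term and n >= min_pages
    ]
    # Most frequent companion first; ties broken alphabetically.
    return [other for other, _ in sorted(hits, key=lambda pair: (-pair[1], pair[0]))]

excite_style = suggest_words("apple", PAGES)
```

Only "computer" appears alongside "apple" on two or more of these sample pages, so it is the only word suggested.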
The GlobalBrain solution is not necessarily better than what Excite does, but rather another approach toward achieving the same goal.
GlobalBrain also provides automatic bookmarking, which is probably more understandable as saved searches. The system will show you past searches, along with the link you visited and found most relevant for each keyword. This bookmarking is also used to help refine search results to an individual level.
The data gathered through bookmarking means that two people with identical profiles (country, sport interest, occupation, etc.) might still get two different sets of results in response to a search, because the system is tailoring matches to what they've selected in the past.
One possible concern with this automatic mechanism is the fear that searches could be tied to an individual. I've long expected some journalist to try tracking down what a politician searches for while at work, in the way that some have sought video rental records.
It's theoretically possible to do this now, but only with a great deal of difficulty. For example, personalization services at Yahoo and Excite, among others, allow for search topics to be saved. This makes it easy for a user to run the search in the future -- you just click on the saved link.
Someone who could access this personal information might be able to discover if some embarrassing searches are being stored. However, the first problem is that the companies aren't likely to release it willingly. In fact, several of them have just pledged ad space to promote privacy issues.
"We would never communicate data on an individual level," said Excite search product manager Kris Carpenter.
Second, it would be difficult to conclusively prove that any individual actually created an account, Carpenter pointed out. After all, no positive identification is required to open a personalized account.
For example, someone could go to a coworker's computer, quickly create a fake account, then store some embarrassing searches. The computer could be linked to the account via a cookie, but there's no way to prove that the computer's owner actually created the account.
Beyond these problems, the main difficulty in gathering this data would be that most people probably have not bookmarked their searches. Someone who actually managed to force their way into an account by some means would likely discover nothing incriminating there.
This is why GlobalBrain's system might concern some people. It would store searches automatically. All the other hurdles in getting this information would remain, of course, but people can still be hypersensitive about privacy. To address this, GlobalBrain's solution would be offline storage of searches.
"Privacy has been key concern when designing the GlobalBrain system, hence there will be an option to download software that stores the personal bookmark information on a user's computer," said Ryan. "The only person that will ever have access to this data is the person that created it."
In addition to the above features, GlobalBrain can also sort pages into different groups, such as "popular" or "high flyers." Popular displays pages just as Direct Hit does, ranked in order of popularity based on clickthrough and time viewed. High flyers ranks pages by how much their popularity is increasing. A "New" option shows the newest pages that have been found relating to particular terms. "Past Preferences" shows results based on a user's individual preferences.
By default, GlobalBrain results show a mixture of the four groups. In this way, relevancy is kept high, yet an element of discovery is also mixed in.
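GlobalBrain has not said how the four groups are combined. One simple possibility, shown purely as an assumption, is to interleave the ranked lists round-robin and skip any page already shown. The `blend` function and the single-letter page names below are hypothetical.

```python
from itertools import zip_longest

def blend(groups, limit=10):
    """Interleave ranked lists round-robin, dropping pages that were already shown.

    This is an assumed mixing rule, not GlobalBrain's documented behavior.
    """
    seen, blended = set(), []
    # zip_longest pads the shorter lists with None once they run out.
    for tier in zip_longest(*groups):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                blended.append(url)
    return blended[:limit]

popular = ["a", "b", "c"]
high_flyers = ["d", "a"]
new = ["e"]
past_preferences = ["b", "f"]
mixed = blend([popular, high_flyers, new, past_preferences], limit=5)
```

Here the top page from each group appears before any group's second choice, which keeps the best-known pages near the top while still leaving room for rising and new ones.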
GlobalBrain is taking a different strategy with its technology than Direct Hit, which is offering its technology to anyone through branded partnerships. In contrast, GlobalBrain has been talking with various search engine companies to find a single partner to purchase its technology.
GlobalBrain says that two companies are currently testing its system, to see how it works in the real world, and a third is about to evaluate it. Ultimately, it hopes the system will launch with one of the companies within a few months.
"The system has worked so well in our extensive trials that we are very excited about launching it and providing users with a more satisfying and relevant search experience," said Ryan.
As for Direct Hit, it sees any deal that GlobalBrain may cut as strengthening demand for its technology.
"I think GlobalBrain's entry into this field validates the Direct Hit technology of user-based popularity ranking as the best method of organizing Internet search results, as compared to link-based methods," said Gary Culliss, chairman and founder of Direct Hit.
Additionally, Direct Hit has been developing its own profile-based searching system. As with GlobalBrain, it would deliver results tailored to particular types of people, based on their profiles.
Privacy ad campaign to launch
News.com, Oct. 7, 1998
Direct Hit Begins Netscape Trial, Has New Partners
Direct Hit, which debuted on HotBot in August, is now running with two additional partners and under trial at Netscape.
At Netscape, Direct Hit results are now being offered in response to some of the service's more popular search terms, such as "beanie babies" and "horoscopes."
Netscape is testing the technology over the next two weeks, and Direct Hit CEO Mike Cassidy says that feedback so far is running 10 to 1 in favor of the service becoming a permanent addition.
If Direct Hit results are available for your search at Netscape, you'll see a "Top 10 Sites" link appear in the "Start Here" box that appears above many search results.
Direct Hit is also available to Mac users running Apple OS 8.5, as a plug-in to the new Sherlock search utility. See the link below for more information.
Direct Hit will also be launching shortly in beta for users of AOL's ICQ messaging service. Watch the Direct Hit site for more details on how the system will work with the ICQ client.
Direct Hit for Apple OS 8.5
Counting Clicks and Looking At Links
The Search Engine Report, Aug. 4, 1998
More about how the Direct Hit system works.
AltaVista Northern Europe Looks Gone For Good
I still haven't gotten a straight answer from AltaVista regarding the closure of its former Northern European site -- and lots of people keep asking. Here's the official line so far from AltaVista:
"The change in the AltaVista Northern Europe site is reflective of AltaVista's shift away from licensing its search technology. Instead, AltaVista is focusing on building and leveraging the AltaVista global brand to provide users with the most satisfying and comprehensive Internet experience."
I'm interpreting this to mean that AltaVista isn't going to renew the mirror site arrangement with Telia, either because AltaVista has changed its strategy, or perhaps -- as AltaVista's support staff messaged one reader -- because Telia decided to no longer partner with AltaVista.
Perhaps it was even a mutual decision, but whatever the case, fans of that former site shouldn't get their hopes up that it will return.
The entire situation has been confusing, in that over the past month, AltaVista has continued to list the site as part of its search network. Yet anyone who follows the link and tries to do a search gets a message that the service has been discontinued and is directed back to the main site.
For example, Danish users are told that the site is no longer running, and "the reason is that AltaVista and Scandinavia Online are working together to make a new and better search engine. You will get both AltaVista's great global index, but also Kvasirs huge Danish catalog."
Swedes are being directed to another search engine, and a similar situation may be happening with other languages.
AltaVista Northern Europe
The site's still up -- but no one is home.
A search engine for those in Denmark, Norway and Sweden (URLs in that order) that appears to use AltaVista technology to custom crawl Scandinavian sites.
European Search Engines
Even more alternatives for everyone in Europe.
Search Engine Notes
Searchopolis Offers Filtered Search Option
N2H2's filtered search service, Searchopolis, went up in beta in October. The service is powered by Inktomi. I expect to bring you more details in a future issue about the technology involved.
Privacy Concerns About Smart Browsing
Researchers at Interhack have posted a document outlining concerns about information that Netscape's What's Related feature reports back to Netscape's servers. They describe how it is possible to determine the various web pages an actual individual has visited, which could possibly be abused. They say a similar situation is true for those using Alexa, whose technology Netscape uses for its What's Related feature. The authors make no accusation that the information is being abused, but rather they point out that the possibility for abuse exists.
What's Related: Everything But Your Privacy
Smart Browsers Ease Searching
The Search Engine Report, July 1, 1998
More details on Netscape's Smart Browsing features.
Yahoo Buys Yoyodyne
Yahoo announced it will purchase target marketing firm Yoyodyne in a $30 million stock swap.
Yahoo Buys Marketer, Hoping To Build Ad Base
Internet World, Oct. 19, 1998
Excite Partners With Netscape Internationally
Excite announced on October 14 that it will be powering Netscape's country-specific search services for Australia, France, Germany, Italy, Japan and the United Kingdom. Excite will also be developing content for several Netcenter channels aimed at these countries.
MSN Out Of Beta
MSN's redesign lost its beta moniker on October 7. Changes went live on the site that day.
Also, last issue I said the name Microsoft Network was now being used for the separate Microsoft Internet access service. In fact, this service is called MSN Internet Access.
Microsoft Unveils MSN Search
The Search Engine Report, Oct. 5, 1998
Details about the improved MSN and Inktomi-powered MSN Search.
Search Engine Articles
A Small Quarterly Profit for Excite, But Investors Are Looking for Consistency
Internet World, Nov. 2, 1998
Excite finally shows a profit, but it still has a way to go.
Highlights Of Online World 98
Mining Company Web Search Guide, Oct. 19, 1998
Unfortunately, I wasn't able to attend Online World 98, a gathering of research and library professionals that featured a panel of search engine product managers. Fortunately, Chris Sherman of the Mining Company's Web Search guide did attend. Read his write-ups on important panels and tips from the show.
Net search engines find TV ads
San Francisco Business Times, Oct. 12, 1998
A nice summary of the current round of search engine television ads.
Netscape Bets Portal Users Will Pay to Play
Industry Standard, Oct. 8, 1998
Netscape is planning to charge for extra services offered through its portal. Will it work, and might others follow? They certainly will if it succeeds, but that remains to be seen. The article provides some details of Netscape's plans.
How do I unsubscribe?
+ Use the form at: http://searchenginewatch.com/sereport/unsubscribe.html
How do I subscribe?
+ Use the form at: http://searchenginewatch.com/sereport/
How do I see past issues?
+ Follow the links at: http://searchenginewatch.com/sereport/
Is there an HTML version?
+ Yes, but not via email. View it online at:
I didn't get Part 1 or 2. Can you resend it?
+ No, but you can view the entire issue online, via the link above.
How do I change my address?
+ Unsubscribe your old one, then subscribe the new one, using the links above.
I need human help!
+ Write to [email protected] DO NOT send messages regarding list management issues to Danny Sullivan. He does not deal with these.
I have feedback about an article!
+ I'd love to hear it. Use the form at http://searchenginewatch.com/about/contact.html.
This newsletter is Copyright (c) Mecklermedia, 1998