THE SEARCH ENGINE UPDATE
March 20, 2000 - Number 73
About The Update
The Search Engine Update is a twice-monthly update of search engine news. It is available only to those people who have subscribed to Search Engine Watch, http://searchenginewatch.com/. Please note that long URLs may break into two lines in some mail readers. Cut and paste, should this occur.
In This Issue
+ About The Search Engine Watch site
+ Search Engine Strategies Conference
+ Submitting To Search Engines & Encouraging Crawlers
+ Microsoft Takes Stake In RealNames
+ Google Adds Directory
Search Engine Articles
+ The usual round up of interesting articles relating to search engines.
+ Subscribing/Unsubscribing Info
Within the main site, I've reorganized the Search Engine News page and added a new daily news feed of search engine articles. This may also be made available in email form, in the near future. Also, a new "Search Engine Index" page provides a compilation of interesting search engine-related statistics. Links to both can be found via the What's New page:
The next Search Engine Strategies seminar is just over a month away. It will be held on April 27, in London, and conference details are online at the URL below. The one day conference will feature both experts on search engine marketing issues and panelists from search engines, including confirmed speakers from Inktomi, LookSmart and Voila. I'll be presenting and moderating throughout the day, and the conference will also have a look at regional and language issues.
Search Engine Strategies London 2000
Search Engine Strategies Coverage
Earlier this month, I spoke at and hosted the Search Engine Strategies conference in New York. If you couldn't make it, I'd recommend reading the series of articles written by About.com's Chris Sherman, listed below. Sherman covered the previous conference, held in San Francisco last November. Many of the same speakers spoke in New York, so the articles are still applicable.
Of course, new information did come up in New York, so I thought I would highlight some particularly interesting points. Additionally, links to articles and comments about the New York conference are also below.
+ In my presentation, I talked about the advantages of breaking up large web sites into a series of smaller sites, as a means of encouraging crawling. A separate article on this subject is below, in the newsletter.
+ Speaker Shari Thurow, of GrantasticDesigns.com, noted that getting listed with directories, such as the Open Directory, seems to help increase your rankings with some crawler-based search engines, especially Excite. Others I have talked with report similar experiences. This makes it all the more important that you build quality sites that will please editors, so that they will consider listing you.
+ Thurow also noted that creating FAQ pages about particular topics provides your site with both great material for visitors and pages that tend to do well with search engines.
+ I-Search moderator Detlev Johnson presented the results of a survey on submission habits. It found that most web marketers who responded prefer to submit manually to search engines rather than use auto-submission tools. Most marketers said that this was because they "know" manual submission is more effective. Most respondents also said that they optimize their web pages for crawler-based search engines and submit only "when necessary." Those who do use auto-submit tools preferred software packages over web-based services. About 300 I-Search readers participated in the survey -- a link to full survey results is below.
+ Kate Wingerson, Editor In Chief of LookSmart, said that it is acceptable to use the new Express submission service to submit subpages from within your site, as long as they have content relevant to the category you select.
Search Engine Keyword Buys Are Essential Campaign Element
InternetNews.com, March 9, 2000
Highlights of the panel on buying banner ads and link placement at search engines.
Search Engine Strategies 99: Special Report
About.com Web Search Guide, Nov. 29, 1999
Comprehensive coverage of the November conference is available here.
Search Engine Marketers Prefer Manual Submission to Auto-Submit Tools
SearchEngineWatch.com, March 20, 2000
Results of the I-Search submission survey.
Submitting To Search Engines & Encouraging Crawlers
In an ideal world, you would never need to submit your web site to crawler-based search engines. Instead, they would automatically come to your site, locate all of your pages by following links, then list each of these pages. That doesn't mean you would rank well for every important term, but at least all the content within your web site would be fully represented. Think of it like playing a lottery. Each page represented in a search engine is like a ticket in the lottery. The more tickets you have, the more likely you are to win something.
In the real world, search engines miss pages. There are several reasons why this may happen. For instance, consider a brand new web site. If no one links to this web site, then search engines may not locate it during their normal crawls of the web. The site essentially remains invisible to them.
This is why Add URL forms exist. Search engines operate them so they can be notified of pages that should be considered for indexing. Submitting via Add URL doesn't mean a page will automatically be listed, but it does bring the page to the search engine's attention.
Add URL forms have long been an important tool for webmasters looking to increase their representation through "deep submitting," which I'll cover below. But it is also important that you consider site architectural changes that can encourage "deep crawling." These architectural changes should keep your site better represented, in the long run.
In the past, there was a strong relationship between using Add URL forms and pages getting listed. Pages submitted via Add URL would tend to get listed and listed more quickly than pages that were not submitted using the forms. For this reason, people often did "deep submits." They would submit many pages at the same time, hoping to get them all listed quickly. In fact, Go (Infoseek) even had a system where you could email thousands of URLs, all of which would be added within a week.
Those days are essentially gone. In my opinion, there is very little value in most people spending much time on deep submits. That is because many search engines have altered the behavior of their Add URL forms in response to spamming attempts.
Go is a good example of this. Until last November, any page submitted to Go via the Add URL form would appear within a day or so. Now, only "root" URLs are accepted. For instance, if you submitted all these URLs:
Go would simply shorten them to this core URL:
It would then visit that URL and follow links from it to other pages, deciding on its own what to gather.
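The shortening described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how Go actually implements it; the function name root_url and the example.com addresses are hypothetical and do not come from the newsletter.

```python
from urllib.parse import urlsplit


def root_url(url: str) -> str:
    """Reduce a URL to its scheme and host, dropping path, query and fragment."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


# Hypothetical submissions, all from the same web site.
submitted = [
    "http://www.example.com/",
    "http://www.example.com/products/index.html",
    "http://www.example.com/about/contact.html",
]

# Every submission collapses to the same root, so only one URL remains.
roots = sorted({root_url(u) for u in submitted})
print(roots)
```

Under this assumption, submitting many deep pages from one site is no different from submitting the home page once.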
Deep submits can still be effective in some places. For instance, AltaVista will list any page submitted within a day or two. Therefore, submitting to AltaVista directly can increase the representation or freshness of your listings. However, AltaVista also considers excessive submission to be spamming. Submit too many pages per day, and you may find yourself locked out of the Add URL form. Even worse, you might find all your pages removed. AltaVista doesn't publish a submission limit, but staying under five pages per day is a good rule of thumb.
Inktomi is another place where deep submits still seem effective. In recent weeks, I've noticed that submissions made to Inktomi via HotBot's Add URL form have appeared within two weeks, if not sooner. In fact, Inktomi has suggested that pages submitted using Add URL will be "tested" within its index for a short period of time. If the pages seem to satisfy queries, then they may be retained. As for limits, HotBot will allow you to add up to 50 pages per day.
Excite has suggested that it, too, will operate a system similar to Inktomi's, where submitted pages will be tested for a short period of time. However, I've seen no evidence of this actually happening. For that reason, I don't suggest wasting your time on a deep submit to Excite, which allows the submission of 25 URLs per week, per web site.
Finally, Lycos is probably the last place where a deep submit might still be worth considering. It has always been more likely to list pages that are submitted to it. Lycos has no submission limits, but I would suggest staying under 50 URLs per day.
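To get a feel for what these limits mean in practice, here is a rough Python sketch that estimates how many days a deep submit would take at each engine. The per-day figures are read from the paragraphs above ("under five" is treated as four, "under 50" as 49); the 120-page site and the function name days_needed are made up for illustration. Excite is left out because its limit is weekly and a deep submit there is not recommended.

```python
import math

# Per-day ceilings taken from the text. The AltaVista and Lycos figures are
# rules of thumb from the article, not limits published by those services.
DAILY_LIMITS = {
    "AltaVista": 4,          # "staying under five pages per day"
    "HotBot (Inktomi)": 50,  # "up to 50 pages per day"
    "Lycos": 49,             # "staying under 50 URLs per day"
}


def days_needed(page_count: int, per_day: int) -> int:
    """Days required to submit page_count pages at per_day pages a day."""
    return math.ceil(page_count / per_day)


pages = 120  # hypothetical site size
schedule = {engine: days_needed(pages, limit) for engine, limit in DAILY_LIMITS.items()}
for engine, days in schedule.items():
    print(f"{engine}: {days} day(s)")
```

With these numbers, the AltaVista rule of thumb stretches a 120-page deep submit over a month, while HotBot and Lycos would take only a few days.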
Encouraging Deep Crawling
It's important to remember that even if you don't submit each and every one of your pages, search engines may still add some of them anyway. Crawlers follow links -- if you have good internal linking within the pages of your web site, then you increase the odds that even pages you've never submitted may still get listed.
In fact, some search engines routinely do "deep crawls" of web sites. None of them will list all of your pages, but they will gather a good amount beyond those you actually submit. Currently, deep crawlers are AltaVista, Inktomi, FAST and Northern Light. And even non-deep crawlers will still tend to gather some pages beyond those actually submitted, assuming they find links to these pages from somewhere within your site.
However, even the best of the deep crawlers will have problems with large web sites. This is because crawlers try to be "polite" when they visit sites and not request so many pages that they might overwhelm a web server. For instance, they might request a page every 30 seconds over the course of an hour. Obviously, this won't allow them to view many pages. Other crawlers are simply not interested in gathering every single page you have. They'll get a good chunk, then move on to other sites.
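The arithmetic behind the "polite" crawling example is simple enough to check. The short Python sketch below uses the figures from the paragraph above (one request every 30 seconds for an hour); the function name pages_per_visit is hypothetical, and real crawlers will of course vary their pacing.

```python
def pages_per_visit(seconds_between_requests: int, visit_seconds: int) -> int:
    """Approximate number of pages fetched at a fixed request interval."""
    return visit_seconds // seconds_between_requests


# One page every 30 seconds over the course of an hour.
fetched = pages_per_visit(30, 60 * 60)
print(fetched)  # 120
```

At that pace, a crawler gathers only about 120 pages per visit, which is a small fraction of a site with thousands of pages.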
For this reason, you might want to consider breaking up your site into smaller web sites. For instance, consider a typical shopping site that might have sections like this:
The first URL is the home page, which talks about books, movies and music available within this site. The second URL is the book section, which contains information about all the books on sale. The third URL is the movie section, and the fourth is the music section.
Now imagine that the three main sections have 500 pages of product information each. Altogether, that gives the site about 1,500 pages available for spidering. Next, let's assume that the best deep crawler tends to only pick up about 250 pages from each site it visits -- this number is completely made up, but it will serve to illustrate the point. This would mean that only 250 pages out of 1,500 pages are spidered, or 17 percent of all those available.
Now it is time to consider subdomains. Any domain that you register, such as "site.com," can have an endless number of "subdomains" that make use of the core domain. All you do is add a word to the left of the core domain, separated by a dot, such as "subdomain.site.com." These subdomains can then be used as the web addresses of additional web sites. So returning to our example, let's say we create three subdomains and use them as the addresses of three new web sites, as so:
Now we move all the book content from our "old" web site into the new "books.site.com" site, doing the same thing for our movies and music content. Each site stands independently of the others. That means when our deep crawler comes, it gathers up 250 pages from one site, moves to the next to gather another 250, then does the same thing with the third. In all, 750 pages of 1,500 are gathered -- 50 percent of all those available. That's a huge increase over the 17 percent that were gathered when you operated one big web site.
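The comparison can be worked through in a few lines of Python. The 250-page cap per site is the made-up figure from the example, and the function name coverage is hypothetical; the sketch simply assumes a crawler stops at the cap for each site it treats as separate.

```python
def coverage(site_sizes, per_site_cap):
    """Return (pages gathered, total pages, fraction gathered)."""
    gathered = sum(min(size, per_site_cap) for size in site_sizes)
    total = sum(site_sizes)
    return gathered, total, gathered / total


# One big site of 1,500 pages versus three sites of 500 pages each.
one_site = coverage([1500], per_site_cap=250)
three_sites = coverage([500, 500, 500], per_site_cap=250)

for label, (gathered, total, fraction) in (("one site", one_site), ("three sites", three_sites)):
    print(f"{label}: {gathered} of {total} pages ({fraction:.0%})")
```

The same total content yields 250 pages gathered as one site and 750 pages as three sites, which matches the 17 percent and 50 percent figures above.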
Root Page Advantage
The change also gives you more "root" pages, which tend to be more highly ranked than any other page you will have. The root page is whatever page appears when you just enter the domain name of a site. Usually, this is the same as your home page. For instance, if you enter this into your browser:
The page that loads is both the Search Engine Watch home page and the "root" page for the Search Engine Watch web server. However, if you have a site within someone else's web server, such as this...
...then your home page is not also the root page. That's because the server has only one root page, whatever loads when you enter "server.com" into your browser.
So in our example, there used to be only one root page, the one that appeared when someone went to "site.com," and this page had to cover all the different product terms. Now, each of the new sites also has a root page -- and each page can be specifically about a particular product type.
Breaking up a large site might also help you with directories. Editors tend to prefer listing root URLs rather than long addresses that lead to pages buried within a web site. So to some degree, breaking up your site into separate sites should give each site more respect.
Some Final Words
If you decide to go the subdomain route, you'll need to talk with your server administrator about establishing the new domains. There is no registration fee involved, but the server company might charge a small administrative fee to establish the new addresses. Of course, you may also have to pay an additional monthly charge for each site you operate.
You could also register entirely new domains. However, I suggest subdomains for a variety of reasons. First, there's no registration fee to pay. Second, it's nice to see the branding of your core URL replicated in the subdomains. Finally, search engines have seemed to treat subdomains with as much respect as completely different domains. Given this, I see no major reason to register new domains.
Search Engine Submission Chart
At-a-glance guide to submission limits and timings for major crawler-based search engines.
Missing Pages At AltaVista
The Search Engine Update, March 3, 2000
Covers recent problems with people finding their pages removed and being unable to resubmit at AltaVista.
Longer Domain Names Arrive
The Search Engine Update, Jan. 4, 2000
Discusses issues relating to having multiple web sites and domain names.
Microsoft Takes Stake In RealNames
Microsoft has taken a 20 percent share in RealNames and is promising native support of the system within its browser by the spring of this year. In a release, both companies also said that they are "dedicated to ensuring the adoption of an open and well-documented standard for how names are allocated in order to preserve the user experience and trademark/brand ownership." To date, the authority to issue particular RealNames has resided with RealNames itself, opening it up to criticism in how it has granted generic terms to some companies. Expect a longer look at this new partnership in the next newsletter.
Microsoft Takes Stake In RealNames
SearchEngineWatch.com, March 15, 2000
Has links to recent articles about the partnership and past articles from Search Engine Watch about RealNames.
Google Adds Directory
Google has added a version of the Open Directory to its listings. In particular, sites within categories have been ranked in order of popularity, as determined by Google's link analysis system. To access the directory, just search at Google and look for "Relevant Category" links at the top of the results page. You can also browse categories from the Google directory home page. There will be a longer review in the next newsletter.
Google Directory Home Page
Search Engine Articles
GoHip Answers Its Critics
Wired News, March 15, 2000
More about the search service that has garnered complaints about modifying people's email, including a contact number for complaints.
Boom Town: 'Grumpy' won't say what's next for Yahoo, but scenarios abound
Wall St. Journal, March 6, 2000
Various guesses about business deals Yahoo might cut, with Yahoo itself making no comment other than that nothing is being ruled out.
The Web: Growing by 2 Million Pages a Day
Industry Standard, Feb. 28, 2000
Compilation of interesting statistics on the growth of the web, from a variety of sources.
How do I unsubscribe?
+ Follow the instructions at the very end of this email.
How do I subscribe?
+ The Search Engine Update is only available to paid subscribers of the Search Engine Watch web site. If you are not a subscriber and somehow are receiving a copy of the newsletter, learn how to subscribe at: http://searchenginewatch.com/about/subscribe.html
How do I see past issues?
+ Follow the links at:
Is there an HTML version?
+ Yes, but not via email. View it online at:
How do I change my address?
+ Send a message to [email protected]
I need human help with my subscription!
+ Send a message to [email protected]. DO NOT send messages regarding list management or site subscription issues to Danny Sullivan. He does not deal with these directly.
I have feedback about an article!
+ I'd love to hear it. Use the form at
This newsletter is Copyright (c) internet.com corp., 2000