Northern Light Adds Search Functions, Freshens Index
From The Search Engine Report
August 4, 1998
Northern Light has added new search functions, and the service is updating its index of the web, which has grown dated in recent months.
Using the new Power Search tab, users can narrow searches in a variety of ways. They can search for terms in the entire document, or just within the title or URL.
Northern Light's page classification types can also be used to narrow searches. These include Page Type, Language, Source, and Subject.
Page Type is determined by an algorithm that classifies pages into categories such as "for sale" or "event listings." Language is determined by a dictionary-based system. Editors classify sites by Subject, into areas such as "arts" and "travel." Finally, Source is determined primarily by domain-filtering, to place pages into categories such as "personal pages" or "commercial web sites."
Northern Light can also sort listings by date, making it the only major search engine to offer this. The date reflects when a page was created or modified, though not all documents will have one, as some web servers fail to report a date. Northern Light also rejects dates that are clearly inaccurate, which turned out to be an unexpected problem.
"We were shocked by how many web documents were dated in the future," said Northern Light's Director of Engineering, Marc Krellenstein.
Sorting results by date can be of mixed value. After all, older documents are not necessarily less relevant. Also, some documents given minor changes may suddenly appear to be newer. Nevertheless, it's a nice option to have, and one that many users have requested.
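To make the date handling concrete, here is a minimal sketch in Python of sorting results by date while discarding dates that lie in the future. It is not Northern Light's code. The Page type, the function names, and the choice to list undated pages last are all assumptions made for illustration; the service has only said that it rejects clearly inaccurate dates and that some pages have no date at all.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Page(NamedTuple):
    url: str
    reported: Optional[datetime]  # date reported by the web server, if any


def usable_date(page: Page, crawl_time: datetime) -> Optional[datetime]:
    """Return the page's date, or None if it is missing or in the future."""
    if page.reported is None:
        return None
    if page.reported > crawl_time:
        # A page cannot have been modified after it was crawled.
        return None
    return page.reported


def sort_by_date(pages: List[Page], crawl_time: datetime) -> List[Page]:
    """Newest first; pages without a usable date keep their order at the end."""
    dated = [(usable_date(p, crawl_time), p) for p in pages]
    with_date = [(d, p) for d, p in dated if d is not None]
    without_date = [p for d, p in dated if d is None]
    with_date.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in with_date] + without_date
```

Under these assumptions, a page stamped with a year 2001 date and crawled in August 1998 is treated the same as a page with no date, so it cannot jump to the top of a newest-first listing.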
Sorting by date is much more useful when searching within Northern Light's special collection, which has documents from 5,500 publications. These are articles not available on the web, but they can be viewed for a fee (searching for them is free). Many of these are periodicals, and date sorting can help bring the newest articles on a subject to the top.
Northern Light now also supports various field commands, similar to those at AltaVista, Infoseek and HotBot. Use title: before search terms to look for them in the title, such as title:bill clinton. The text: command works the same way but finds terms only in the body copy, while url: can be used to restrict searches to a particular site, such as url:netscape.com. The commands can also be combined, such as url:nasa.gov title:pathfinder to find all pages on NASA servers with Pathfinder in the page title.
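The field commands can be illustrated with a short Python sketch that splits a query into field and term pairs and checks them against a page. This is only an illustration, not how Northern Light processes queries. The function names and the dictionary layout of a page are hypothetical, and the sketch assumes a prefix applies only to the single term it is attached to, which the service does not spell out.

```python
import re

FIELDS = ("title", "text", "url")
# An optional "field:" prefix followed by one whitespace-delimited term.
TOKEN = re.compile(r"(?:(%s):)?(\S+)" % "|".join(FIELDS))


def parse_query(query):
    """Split a query into (field, term) pairs; field is None for plain terms."""
    return [(m.group(1), m.group(2)) for m in TOKEN.finditer(query)]


def matches(page, query):
    """True if every term appears in its field (or anywhere, if unprefixed).

    `page` is a dict with "title", "text" and "url" strings (hypothetical layout).
    """
    for field, term in parse_query(query):
        if field is None:
            haystack = " ".join(page.values())
        else:
            haystack = page.get(field, "")
        if term.lower() not in haystack.lower():
            return False
    return True
```

With this sketch, parse_query("url:nasa.gov title:pathfinder") yields the pairs ("url", "nasa.gov") and ("title", "pathfinder"), and a page matches only when both conditions hold.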
Also, a special section within Northern Light has been set up in conjunction with Billboard for those interested in music. Searches can easily be restricted to music publications or music web sites, along with some other narrowing options.
Finally, the Northern Light web index had grown quite dated over the past weeks, a situation the service says it is now moving rapidly to correct.
Krellenstein said the web index fell behind while the service was busy building up its special collection documents. But the web crawler has been playing catch-up, and it will continue to be kept busy.
"We have no backlog of data. We scaled up the crawler four-fold a few days ago, and in August we will intensify the crawl," Krellenstein said.
The immediate goal is to freshen the data in the existing index, though new finds will be added. By September, Northern Light hopes to have expanded the index well past 100 million web pages. It currently stands at 80 million.
Billboard Music Search