As most readers know, Search Engine Watch concentrates primarily on developments in searching across the entire web. However, there's a whole other category of search that interests many people -- search tools and products that make sites and intranets searchable.
In this issue of SearchDay, guest writer Avi Rappoport provides an update on developments in the world of search products. Avi is a respected expert on this subject and operates an outstanding site devoted to it, SearchTools.com. You can visit it via the URL below:
And now, on to news about search tools and products...
Metadata Search: Report Update
Metadata, structured information about documents, can improve search engine results significantly. This report covers metadata and search engines, including new resources such as XML and RDF metadata, the Dublin Core NISO standard, Adobe XMP metadata within files, and topic maps.
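As a rough illustration of how an indexer might pick up this kind of metadata, here is a minimal Python sketch that collects Dublin Core elements from HTML meta tags. It assumes the common convention of writing them as `<meta name="DC.title" content="...">`; the class and function names (`DublinCoreExtractor`, `extract_dublin_core`) are made up for this example and do not come from any product covered in the report.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collects <meta name="DC.xxx" content="..."> values (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        # Maps a lowercased element name (e.g. "title") to its list of values.
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        # Assumption: Dublin Core elements carry a "DC." prefix in the name.
        if name.lower().startswith("dc.") and content is not None:
            element = name[3:].lower()
            # Elements such as creator may repeat, so keep every value.
            self.fields.setdefault(element, []).append(content)


def extract_dublin_core(html):
    """Return a dict of Dublin Core element names to lists of values."""
    parser = DublinCoreExtractor()
    parser.feed(html)
    parser.close()
    return parser.fields
```

A search engine could then index these values as separate fields, for example to let searchers restrict a query to titles or creators. Real pages vary widely, so a production indexer would need to handle other prefixes and metadata formats as well.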
Searching PDF: Report and Listings Update
Advice for web site and intranet managers on site search engines and PDF files includes suggestions for preparing PDF for searching, the new Adobe XMP metadata, identifying PDF indexing problems, and displaying PDF files in search results. Lists 44 site search engines which index and search PDF files.
Open Source Search Engines: Listings Update
Now includes a summary of Eric Lease Morgan's comparative review of eight leading open-source search engines, as well as listings for twenty open-source search engines.
Automatic Categorization Report Update
Automatic categorization is a hot topic these days, as the next frontier in search and navigation functionality. Large web sites and intranets need tools to group their reams of information into coherent categories, so they're looking to automated systems. Grouping search results by category can also provide context and allow searchers to locate the most fruitful areas quickly. Our report now has links to some excellent articles on this topic, while the Classification Tools page lists many new products.
Search Engines for Databases Report Update
Database search and text search functionality are merging, to the benefit of end-users. Databases are starting to improve their ability to index and search large amounts of text, while text-search engines are storing more database structure such as field names and value formats (number, date, price, etc.). This report describes the advantages of each approach and links to database search software.
Commerce Search Engines: Report Update
The quality of the search engine on an online store has a direct relationship to that store's bottom line, so it's even more important to make it work! Research analyst reports describe common problems with product catalog searching. This report includes a checklist of the most important functions and interface elements of an e-commerce search engine, and a new listing of the most prominent search engines.
New Search Engine Implementation Consultants Report
This page lists consultants who can help with installing, configuring or tuning a search engine for your site. This is not a recommendation, simply a list. If you are a consultant, please contact searchtools.com to get added to this list.
Designed for online catalogs and e-commerce, this Java search engine indexes database fields as well as HTML and other text files.
This search code library uses linguistic analysis to improve retrieval, based on research from Lernout & Hauspie.
Free open-source Perl search engine works with Arabic and Roman code pages, allows a customized header and footer for results pages.
Educesoft Windows ASP Search Engine
Search engine uses Windows ASP (Active Server Pages), indexes using file system, provides a browser administration interface and highlights matched text in title/description fields in search results.
Fuzzy searching for structured data, works with relational databases and standard domain vocabulary.
Scalable Windows search engine provides extensive control for indexing spider, multiple languages, search zones, customized results formatting and relevance rankings, and search logging.
Perl search engine for Unix and Windows is designed for topical portals. It uses an indexing robot to gather pages, provides customizable templates, a special relevance algorithm and results grouped in categories.
Multilingual search engine with robot crawler, scales to very large numbers of documents. Includes language identification and linguistic analysis, clustering and categorization, many file formats and all standard query formats. Java administration interface, available on Windows, Unix and OS/390.
Juggernautsearch Perl Portal Search Engine
Perl search engine designed to scale to millions of documents, Pro version adds sophisticated indexing controls.
Indexes and stems both Arabic and Roman text, scales up for large sites and topical portals. Includes a customizable header and footer for results pages. Runs on Unix and Windows.
Orangevalley Intranet Search Engine
Windows search engine with a spider for crawling intranets, uses ASP for searching. Search results show a snippet of text with the match words highlighted, searches are logged for later analysis.
Designed for Knowledge Management, e-commerce and complex customer support applications, uses natural-language processing when possible. Runs on Unix and Windows.
Search engine for both structured (databases and XML) and free-text searching. This system is often used for e-commerce sites, integrated as Java middleware. It has realtime index updating, spellchecking, custom synonym listings, clustering of search results into categories, and very fast retrieval.
Smaller-scale Windows search engine, but otherwise similar to Enterprise Search.
Free search engine indexes using the local file system, provides templates for results page customization.
Those in desperate need of a search engine for the VMS operating system should contact A/WWW Enterprises at www.awcubed.com, which has ported the open-source search engine ISearch to VMS.
The open-source search engine ht://Dig has reported a security vulnerability and posted updates and patches to fix it. Administrators running versions 3.1.0b2 through 3.1.5, or any of the 3.2 betas, should update immediately.
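For administrators who want to check several installations at once, the affected range can be expressed as a small Python sketch. It assumes version strings of the form `3.1.0b2` or `3.1.5`, and it flags every 3.2 beta because the notice above does not say which beta, if any, contains the fix. The function name `is_affected` is made up for this example and is not part of ht://Dig.

```python
import math
import re

# Assumed version format: major.minor.patch with an optional beta suffix "bN".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:b(\d+))?$")


def is_affected(version):
    """Return True if the version falls in the range named in the notice."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    beta = int(match.group(4)) if match.group(4) else None

    # Any 3.2 beta is treated as affected (conservative assumption).
    if (major, minor) == (3, 2) and beta is not None:
        return True

    # A final release sorts after all of its betas, so use infinity for it.
    key = (major, minor, patch, beta if beta is not None else math.inf)
    lowest = (3, 1, 0, 2)          # 3.1.0b2
    highest = (3, 1, 5, math.inf)  # 3.1.5 final
    return lowest <= key <= highest
```

Calling `is_affected("3.1.3")` reports an affected release, while a version outside the named range, such as `3.0.8`, does not. Treat the result only as a prompt to read the ht://Dig advisory, which remains the authoritative source for the fixed versions.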
Obsolete and Discontinued Search Engines
- JHLSearch Discontinued
Java Search engine no longer available.
- RightSearch Search Engine Acquired
Company has been bought and the technology incorporated into other applications.
- Search-It Service Discontinued
The search server has not responded to queries for the last week, nor does anyone answer email, so I think this service has been discontinued.
- SeekIt Service Discontinued
The server does not respond, nor is there any way to contact the company.
- Twirlix Directory Discontinued
Portal ASP remote search service has closed down.
- XRS and BUS XML Search Engines No Longer Available
These XML search engines were the projects of a professor who has moved on, and the pages are no longer accessible.
NOTE: Article links often change. In case of a bad link, use the publication's search facility, which most have, and search for the headline.