We’ve just learned that GoHook is building a database of Adobe Acrobat (PDF) content.
GoHook used to provide an archive of completed eBay auctions. However, the service ended in March after the company ran into several problems with eBay.
The new GoHook PDF database includes about 500,000 documents. According to the company, more than 10,000 documents are added to the database each week.
GoHook offers several options to view PDF content online. You can click the hyperlinked title to open the document in Adobe Acrobat Reader, or choose one of the following:
+ View the document converted into HTML
+ View a cached/archived version of the document in PDF
+ View the document converted into text (txt) format
Google, Yahoo, AJ, and other web databases that crawl PDF content do not offer cached versions of these documents. The Wayback Machine does offer a small amount of archived PDF material.
GoHook results pages also contain the date the document was crawled along with its size.
You can use quotation marks to search for phrases and a minus sign to exclude a term. A default “and” is used between terms.
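To make the query syntax concrete, here is a rough Python sketch of how a query with quoted phrases, minus-sign exclusions, and a default “and” could be interpreted. It is only an illustration of the rules described above, not GoHook's actual code; the names parse_query and matches are hypothetical, and the plain substring matching is a simplifying assumption.

```python
import re

# One token is an optional leading minus followed by either a
# "quoted phrase" or a single run of non-space characters.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Split a query into (required, excluded) terms; phrases stay whole.

    Hypothetical helper for illustration only.
    """
    required, excluded = [], []
    for minus, phrase, word in TOKEN.findall(query):
        term = (phrase or word).lower()
        if not term:
            continue  # skip empty phrases such as ""
        (excluded if minus else required).append(term)
    return required, excluded

def matches(text, query):
    """Return True if text satisfies the query.

    Assumption: simple case-insensitive substring matching, so "draft"
    also matches "drafted". A real engine would match whole words.
    """
    required, excluded = parse_query(query)
    haystack = " ".join(text.lower().split())  # normalize whitespace
    has_all = all(term in haystack for term in required)   # default "and"
    has_none = not any(term in haystack for term in excluded)
    return has_all and has_none
```

For example, the query `"annual report" budget -draft` requires the exact phrase and the word "budget", and rejects any document that mentions "draft".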
Snippets do not show your search terms in context, a feature that would make GoHook much more useful. However, you can quickly open a text version of a document and use edit/find to locate your search terms.
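If you save the text version of a document, the edit/find step can also be approximated with a few lines of Python. This is a minimal sketch of a keyword-in-context lookup, assuming the document text is already in a string; the function name keyword_in_context and the width parameter are made up for this example and have nothing to do with GoHook itself.

```python
def keyword_in_context(text, term, width=40):
    """Return one snippet per occurrence of term, with surrounding text.

    Hypothetical helper: matching is case-insensitive and each snippet
    keeps up to `width` characters on either side of the match.
    """
    flat = " ".join(text.split())  # collapse line breaks and extra spaces
    lowered = flat.lower()
    needle = term.lower()
    if not needle:
        return []  # an empty search term has no meaningful context

    snippets = []
    start = lowered.find(needle)
    while start != -1:
        end = start + len(needle)
        left = max(0, start - width)
        right = min(len(flat), end + width)
        snippets.append(flat[left:right])
        start = lowered.find(needle, end)  # continue after this match
    return snippets
```

Calling `keyword_in_context(document_text, "pdf")` returns a list of short excerpts, one for each place the term appears.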
The company is also developing a database of .WAV sound files. At the moment, GoHook Audio search contains only 5,000 files.