Google Opens Tesseract OCR Software

The Google Code Blog announced that Google has “re-released” the Tesseract OCR software to the open source community. OCR, or optical character recognition, is the technology for converting printed text on paper into machine-readable text. So if you have a stack of papers you typed up in your college days and want them stored in digital format, you can scan them and use OCR to convert them for you.

Tesseract was originally developed by HP between 1985 and 1995. In 2005, HP and the University of Nevada, Las Vegas opened it to the community. Google claims that Tesseract OCR is “far more accurate than any other Open Source OCR package out there.” Some more detail at
