I play with open-source OCR (Optical Character Recognition) packages periodically. My last foray was a few years ago when I bought a tablet PC and wanted to scan in some of my course books so I could carry just one thing to school. I tried every package I could find, and none of them worked well enough even to consider using. I ended up using the commercial version of Adobe Acrobat, which allows you to use the scanned page as the visual (preserving things like equations in math books), but it applies OCR to the text so you can search. It ended up being quite handy, and I was a little sad that I was incapable of getting any kind of result with open-source offerings.
Full story »