Optical character recognition (OCR)

About

Optical character recognition, usually abbreviated to OCR, is the translation of:

Software

Tasseract

All the open source software are based on the tesseract OCR engine.

Command line:

tesseract imagename outputbase [-l lang] [--oem ocrenginemode] [--psm pagesegmode] [configfiles...]

VietOCR

VietOCR is the only one that I found which:

  • simply works
  • is easy to use.
  • gives great result for screen shot. You have still to check the “ScreenShot Mode” from the Image menu.

_

Free OCR

For windows, you have also FreeOCR 2.6. It work well but you can't process more than one page at a time. May be it's the good way because you always need to clean the result.

Documentation / Reference


Powered by ComboStrap