Optical character recognition (OCR)

1 - About

Optical character recognition, usually abbreviated to OCR, is the translation of images of typed, handwritten, or printed text into machine-encoded text.

3 - Software

3.1 - Tesseract

The open source tools listed here are all based on the Tesseract OCR engine.

Command line:

tesseract imagename outputbase [-l lang] [--oem ocrenginemode] [--psm pagesegmode] [configfiles...]
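
The usage line above can be driven from a script. The following is a minimal sketch, assuming Python and a tesseract executable on the PATH; the helper names build_tesseract_command and run_ocr are hypothetical, as are the file names in the usage note. Only the flags shown in the usage line are used.

```python
import shutil
import subprocess
from typing import List, Optional


def build_tesseract_command(
    image: str,
    output_base: str,
    lang: Optional[str] = None,
    oem: Optional[int] = None,
    psm: Optional[int] = None,
) -> List[str]:
    """Assemble the argument list in the order given by the usage line."""
    cmd = ["tesseract", image, output_base]
    if lang is not None:
        cmd += ["-l", lang]          # language, e.g. "eng"
    if oem is not None:
        cmd += ["--oem", str(oem)]   # OCR engine mode
    if psm is not None:
        cmd += ["--psm", str(psm)]   # page segmentation mode
    return cmd


def run_ocr(image: str, output_base: str, **options) -> str:
    """Run tesseract and return the path of the text file it is expected to write."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image, output_base, **options), check=True)
    # Assumption: with no config file, tesseract writes plain text to <output_base>.txt
    return output_base + ".txt"
```

For example, build_tesseract_command("scan.png", "out", lang="eng", psm=6) yields the same arguments as typing tesseract scan.png out -l eng --psm 6 in a shell.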

3.2 - VietOCR

VietOCR is the only tool I found that:

  • simply works
  • is easy to use
  • gives great results for screenshots, provided you check “ScreenShot Mode” in the Image menu

3.3 - Free OCR

On Windows, there is also FreeOCR 2.6. It works well, but it cannot process more than one page at a time. That may be for the best, since you always need to clean up the result anyway.
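
Part of that clean-up can be scripted. The following is a rough sketch in Python, not something FreeOCR provides; the function name clean_ocr_text and the choice of fixes (rejoining words split across lines, collapsing whitespace) are assumptions about what typical OCR output needs. Recognition errors still have to be checked by hand.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Apply a few conservative layout fixes to raw OCR output."""
    text = raw.replace("\r\n", "\n")
    # Rejoin words hyphenated across a line break: "recog-\nnition" -> "recognition".
    # Note: this also joins genuinely hyphenated compounds that happen to break there.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs inside a line.
    text = re.sub(r"[ \t]+", " ", text)
    # Strip leading and trailing spaces on each line.
    text = "\n".join(line.strip() for line in text.split("\n"))
    # Reduce three or more consecutive newlines to a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```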
