Optical character recognition (OCR)


About

Optical character recognition, usually abbreviated to OCR, is the translation of raster images into characters (i.e. text).

Software

Tesseract

The open source tools listed here are based on the Tesseract OCR engine.

Command line

tesseract imagename outputbase [-l lang] [--oem ocrenginemode] [--psm pagesegmode] [configfiles...]
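The usage line above can also be driven from a script. The following is a minimal sketch in Python, assuming the tesseract executable is installed and on the PATH. The helper names (build_tesseract_command, run_tesseract) and the file names are hypothetical and only serve the illustration; the argument order simply follows the usage line.

```python
import shutil
import subprocess


def build_tesseract_command(image, outputbase, lang=None, oem=None, psm=None, configfiles=()):
    """Build the argument list in the order given by the usage line.

    Hypothetical helper: every optional part is added only when it is set.
    """
    cmd = ["tesseract", str(image), str(outputbase)]
    if lang is not None:
        cmd += ["-l", lang]          # language, e.g. "eng"
    if oem is not None:
        cmd += ["--oem", str(oem)]   # OCR engine mode
    if psm is not None:
        cmd += ["--psm", str(psm)]   # page segmentation mode
    cmd += list(configfiles)         # optional config file names go last
    return cmd


def run_tesseract(image, outputbase, **options):
    """Run tesseract and return the command that was executed.

    Assumes the executable is on PATH; raises FileNotFoundError otherwise.
    """
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")
    cmd = build_tesseract_command(image, outputbase, **options)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd


if __name__ == "__main__":
    # Hypothetical file names: page.png is the input image, page the output base.
    print(build_tesseract_command("page.png", "page", lang="eng", psm=6))
```

With the default text output, a call such as run_tesseract("page.png", "page", lang="eng") should leave the recognized text in page.txt. Keeping the command construction separate from the execution makes it easy to check the arguments before running anything.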

VietOCR (GUI for Tesseract)

VietOCR is the only one that I found which:

  • simply works
  • is easy to use
  • gives great results for screenshots. You still have to check “ScreenShot Mode” in the Image menu.


Free OCR

For Windows, there is also FreeOCR 2.6. It works well, but you can't process more than one page at a time. That may be a good thing, because you always need to clean up the result.

Library

Documentation / Reference




