free OCR solutions

You have an image containing some text or digits, and you want to convert it to plain text so that you can easily process it with a program.

Use OCR (optical character recognition). Let’s look at two free solutions:

(1) tesseract
“Tesseract is probably the most accurate open source OCR engine available. Combined with the Leptonica Image Processing Library it can read a wide variety of image formats and convert them to text in over 60 languages. It was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but since then it has been improved extensively by Google. It is released under the Apache License 2.0.” (source)

Under Ubuntu you can install it with the good old apt-get; the package is called tesseract-ocr.

Let’s take an example image, numbers.jpg.

Print the result to the standard output:

$ tesseract numbers.jpg stdout

Write the result to a file:

$ tesseract numbers.jpg output
$ cat output.txt 

As can be seen, the extension “.txt” is added automatically.
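
Once the text is out of the image, ordinary shell tools can process it. Here is a minimal sketch, assuming you only care about the numbers in the recognized text. The helper name extract_numbers and the sample text are made up for illustration; they are not part of tesseract.

```shell
# Hypothetical helper: print every run of digits read from stdin, one per line.
extract_numbers() {
    grep -oE '[0-9]+'
}

# Made-up text standing in for OCR output; prints each digit run on its own line.
printf 'Total: 42 items\nCode 2014-11\n' | extract_numbers
```

With a real image you could feed the OCR result straight into the helper, e.g. tesseract numbers.jpg stdout | extract_numbers. Keep in mind that OCR can misread characters, so treat the extracted values as something to double-check.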

(2) online OCR
Just google “online OCR” :) Free OCR worked pretty well for me. Just upload your image, fill out a captcha and there you go.

