This prototype recognizes text inside an image using Tesseract OCR. The underlying Tesseract engine processes the picture and returns anything it believes is text.
Follow the points below to use Tesseract OCR iOS and get better output from it.
Training data is used to train an algorithm. Generally, it is a certain percentage of an overall dataset, with the remainder held back as a testing set. As a rule, the better the training data, the better the algorithm or classifier performs. Tesseract requires language-specific training data to make predictions; "language-specific" means that it predicts only within the boundaries of a given language.
To add training data, drag the tessdata folder into the project and set the "Added folders" option to "Create folder references". This creates a referenced folder. Do not forget to select a target before clicking Finish. For this project we have included only the English training files in the tessdata folder. You can download and add other tessdata files as your project requires.
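Because a missing language file is an easy mistake to make, it can help to check the folder before running recognition. The sketch below is not part of the prototype: `missingTrainedData` is a hypothetical helper, and it only relies on the Tesseract convention that each language is stored as `<code>.traineddata` inside `tessdata`.

```swift
import Foundation

/// Returns the language codes whose `<code>.traineddata` file is missing
/// from the given tessdata directory. (Hypothetical helper for illustration.)
func missingTrainedData(for languages: [String], inTessdataAt directory: URL) -> [String] {
    let fileManager = FileManager.default
    return languages.filter { code in
        // Tesseract language files follow the pattern "<code>.traineddata".
        let file = directory.appendingPathComponent("\(code).traineddata")
        return !fileManager.fileExists(atPath: file.path)
    }
}
```

In an app, the directory would typically be something like `Bundle.main.bundleURL.appendingPathComponent("tessdata")`, assuming the folder reference was copied to the bundle root; with only the English files added, asking for `["eng", "fra"]` would report `["fra"]` as missing.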
Image scaling is performed to increase resolution without losing image quality. We can implement it by preserving the aspect ratio of the image, that is, the proportional relationship between its width and height.
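As a rough illustration of aspect-ratio-preserving scaling, the sketch below computes the target size only; drawing the image at that size is left out. `PixelSize`, `scaledSize` and the `maxDimension` parameter are hypothetical names chosen for this example, not part of the prototype or of Tesseract OCR iOS.

```swift
import Foundation

/// A plain width/height pair, used instead of CGSize so the sketch runs anywhere.
struct PixelSize: Equatable {
    var width: Double
    var height: Double
}

/// Scales `size` so that its longer side equals `maxDimension`,
/// keeping the width-to-height ratio unchanged.
func scaledSize(for size: PixelSize, maxDimension: Double) -> PixelSize {
    guard size.width > 0, size.height > 0 else { return size }
    let aspectRatio = size.width / size.height
    if size.width > size.height {
        // Landscape: fix the width, derive the height from the ratio.
        return PixelSize(width: maxDimension, height: (maxDimension / aspectRatio).rounded())
    } else {
        // Portrait or square: fix the height, derive the width from the ratio.
        return PixelSize(width: (maxDimension * aspectRatio).rounded(), height: maxDimension)
    }
}
```

For example, a 320 x 240 image scaled with `maxDimension: 640` becomes 640 x 480, so the 4:3 ratio is preserved.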
Image noise is a random variation of brightness or color information in an image and is usually an aspect of electronic noise. Removing noise from an image improves its quality.
Tesseract can be used to recognize documents, receipts, street signs, and so on. Let's go through each of these with examples.
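The text does not say which noise-removal method the prototype uses. As one common option, a 3x3 median filter replaces each pixel with the median of its neighbourhood, which suppresses isolated bright or dark specks. The sketch below assumes a grayscale image stored as a rectangular array of 0...255 values; `medianFiltered` is a hypothetical name.

```swift
/// Applies a 3x3 median filter to a grayscale image stored as rows of
/// 0...255 intensities. Border pixels use only the neighbours that exist.
/// Assumes every row has the same length.
func medianFiltered(_ pixels: [[UInt8]]) -> [[UInt8]] {
    let height = pixels.count
    guard height > 0, let width = pixels.first?.count, width > 0 else { return pixels }
    var result = pixels
    for y in 0..<height {
        for x in 0..<width {
            // Collect the pixel and its neighbours, clamped to the image bounds.
            var window: [UInt8] = []
            for ny in max(0, y - 1)...min(height - 1, y + 1) {
                for nx in max(0, x - 1)...min(width - 1, x + 1) {
                    window.append(pixels[ny][nx])
                }
            }
            window.sort()
            // Middle element; for even-sized border windows this is the upper median.
            result[y][x] = window[window.count / 2]
        }
    }
    return result
}
```

A single bright speck (255) in the centre of an otherwise uniform 3x3 patch of value 10 is replaced by 10, while the surrounding pixels stay unchanged.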
- Let’s consider an example of a picture of a book page.
output:
Mild Splendour of the various-vested Night!
Mother of sirildly-working visions! haill
I watch thy gliding, while with watery li ht
Thy weak eye glimmers through a fleecy veil;
And when thou lovest thy pale orb to shroud
Behind the ather'd blackness lost on high;
And when thou dartest from the wind-rent cloud
Thy placid lightning o'er the awaken'd sky.
- A slightly harder example is a receipt, which has a non-uniform text layout and multiple fonts. Book pages and documents have a well-defined structure, little variation in font size, and evenly spaced text, which is not the case for receipts. This example shows how Tesseract performs on a scanned receipt.
output:
Store #05666
3515 DEL MAR HTS, RD
SAN DIEGO, CA 92130
(858) 792-7040
Register #4 Transaction #571140
Cashier #56661020 8/20/17 5:45PM
wellness+ with Plenti
Plenti Card#: 31)000000000(4553
1 G2 RETRACT BOLD BLK 2PK
1.99 T
SALE 1/1.99, Reg 1/4.69
Discount 2 70-
1 Items
Subtotal
1.99
Tax
.15
Total
2.14
*MASTER*
2.14
MASTER card * #)0()000000000(5485
App #AA APPROVAL AUTO
Ref # 05639E
Entry Method: Chip
- Tesseract can also be used to recognize street signs. This example shows how it behaves when the image contains symbols and dark boundaries.
- Tesseract does not handle dark boundaries well and often mistakes them for text. However, if we help Tesseract by cropping the image to the text region, the output improves considerably.
output:
2:fi)::s
Caution
Site traffic
- The first line of the output is wrong because Tesseract tried to read a symbol on the sign as text.
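The cropping step mentioned for the street sign can be sketched as follows. This assumes the text region is already known (for example, selected by hand); how to find it is not covered here. `PixelRect`, `cropRect` and the `padding` parameter are hypothetical names for illustration only.

```swift
/// An integer rectangle in pixel coordinates, with the origin at the top-left.
struct PixelRect: Equatable {
    var x: Int
    var y: Int
    var width: Int
    var height: Int
}

/// Expands `textRegion` by `padding` pixels on every side and clamps the
/// result to an image of `imageWidth` x `imageHeight` pixels.
func cropRect(around textRegion: PixelRect, padding: Int,
              imageWidth: Int, imageHeight: Int) -> PixelRect {
    let left = max(0, textRegion.x - padding)
    let top = max(0, textRegion.y - padding)
    let right = min(imageWidth, textRegion.x + textRegion.width + padding)
    let bottom = min(imageHeight, textRegion.y + textRegion.height + padding)
    // Guard against regions that lie entirely outside the image.
    return PixelRect(x: left, y: top, width: max(0, right - left), height: max(0, bottom - top))
}
```

A small padding keeps the characters from touching the edge of the cropped image, while the clamping ensures the rectangle never extends past the original picture.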
Tesseract OCR iOS and TesseractOCR.framework are distributed under the MIT license (see LICENSE.md).
Tesseract, maintained by Google (http://code.google.com/p/tesseract-ocr/), is distributed under the Apache 2.0 license (see http://www.apache.org/licenses/LICENSE-2.0).