ansaurus

Question

Answer 1

+2 A:

I get quite good results setting

TessBaseAPI::SetVariable("tessedit_char_whitelist", "0123456789");

while gently urging the user to fit the numbers inside a box shown on screen. This makes locating the numbers easier for me, and it ensures the user keeps the image steady and at a reasonable distance, which leads to a sharper image.

I have thought about altering valid_word() in tesseract-2.04/dict/permute.cpp, but there seems to be no need for that.

The next step will be to hardcode a minimum/maximum character size, so that recognition takes far less than the 500 ms it does now. After that, I want to add some code that keeps track of results over time, so that reading a 5 in 90% of the frames and an 8 in only 10% leads the code to remember the 5.
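The result tracking described above could look roughly like the sketch below. It is only an illustration, not code from the project: `ReadingVoter` and its window size are hypothetical names, and a plain majority vote over the last few frames is an assumption about how "remembering the 5" might be done.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical helper (not part of Tesseract): remembers the last
// `window` OCR readings and reports the one seen most often.
class ReadingVoter {
public:
    explicit ReadingVoter(std::size_t window) : window_(window) {}

    // Record the text recognized in the newest frame.
    void add(const std::string& reading) {
        recent_.push_back(reading);
        ++counts_[reading];
        if (recent_.size() > window_) {
            // Forget the oldest frame so stale readings age out.
            std::map<std::string, std::size_t>::iterator it =
                counts_.find(recent_.front());
            if (--(it->second) == 0) {
                counts_.erase(it);
            }
            recent_.pop_front();
        }
    }

    // Most frequent reading in the window, or "" if nothing was added yet.
    // Ties go to the string that sorts first (std::map iteration order).
    std::string best() const {
        std::string winner;
        std::size_t top = 0;
        for (std::map<std::string, std::size_t>::const_iterator it =
                 counts_.begin();
             it != counts_.end(); ++it) {
            if (it->second > top) {
                top = it->second;
                winner = it->first;
            }
        }
        return winner;
    }

private:
    std::size_t window_;
    std::deque<std::string> recent_;
    std::map<std::string, std::size_t> counts_;
};
```

Feeding it nine frames that read "5" and one that reads "8" makes `best()` return "5". A small window reacts faster when the user moves to a different number, while a larger one smooths out more misreads.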

It all depends on the use case you have. I'm lucky in the sense that I'm allowed to just show a 200x50 box which will contain the number.
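For the fixed box, one way to hand only that region to the recognizer is to copy it out of the camera frame first. The sketch below assumes an 8-bit grayscale frame stored row by row; `cropGray` and its parameters are made-up names for illustration, and the real app may get its pixels in a different layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copies the w x h rectangle whose top-left corner
// is (x, y) out of an 8-bit grayscale frame stored row by row.
// Returns an empty vector if the rectangle does not fit in the frame.
std::vector<std::uint8_t> cropGray(const std::vector<std::uint8_t>& frame,
                                   std::size_t frameWidth,
                                   std::size_t frameHeight,
                                   std::size_t x, std::size_t y,
                                   std::size_t w, std::size_t h) {
    if (frame.size() < frameWidth * frameHeight ||
        x + w > frameWidth || y + h > frameHeight) {
        return std::vector<std::uint8_t>();
    }
    std::vector<std::uint8_t> box;
    box.reserve(w * h);
    for (std::size_t row = 0; row < h; ++row) {
        // Start of this row's slice inside the full frame.
        const std::uint8_t* src = frame.data() + (y + row) * frameWidth + x;
        box.insert(box.end(), src, src + w);
    }
    return box;
}
```

With a 200x50 box, the recognizer then only has to look at 10,000 pixels instead of the whole frame, which should also help with recognition time.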

mvds 2010-07-15 20:41:37

Good answer, +1 for it. Can you tell me where you got this idea, or where these kinds of things are documented?

Madhup 2010-07-16 10:28:45

The whitelisting is documented (somewhere, Google is your friend), and the other things are my plans while I wait to see whether the project gets a *go*.

mvds 2010-07-16 10:42:54

Training tesseract to use with iPhone
