I am looking for an OCR library in Java to extract details from an image (a scan of an IELTS certificate): http://pbrusilovskij.net/wp-content/uploads/2008/09/ielts-sprachzertifikatt.jpg

I need to pull out fields such as Family Name, First Name, etc. from the image and store them in a database.

A: 

You can try Google's OCR API: http://code.google.com/apis/documents/docs/3.0/developers_guide_protocol.html#OCR

Antonio
I need open-source code in Java. The Google OCR API is not open source.
Jinith
The Google OCR API is also unable to recognise the fields in my image. I even tried the Asprise OCR API, still with no useful results.
Jinith
If you're considering web APIs for OCR, take a look at http://www.wisetrend.com/wisetrend_ocr_cloud.shtml. It is also not open source, but it should give better recognition quality.
Eugene Osovetsky
+1  A: 

In my opinion the answer to your question is: no, there is no open-source solution for you, especially in Java. The task you are trying to solve is called Data Capture, and there is a billion-dollar-plus industry around it. The specific problem you have is Fixed Form processing, the easiest part of Data Capture. There are plenty of commercial solutions that solve much harder problems easily. And yes, low contrast and a textured background are not a problem for them.

Compared to industry solutions, open-source OCR tools are still struggling with basic OCR problems, and based on my experience I would not expect them to work on this image. So even if you opt for open-source OCR, you will still need commercial image-processing tools to produce a bi-tonal (black-and-white) image that can be OCR-ed at all.
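
To make the "bi-tonal image" step concrete, here is a minimal sketch in plain Java (standard `java.awt.image` only) of global-threshold binarization. The class name `Binarizer`, the method `binarize`, and the threshold value are assumptions chosen for illustration. A single global threshold is the simplest possible approach and will probably not be enough for a low-contrast, textured scan like this certificate, which is exactly where adaptive or commercial preprocessing comes in.

```java
import java.awt.image.BufferedImage;

final class Binarizer {

    /**
     * Returns a bi-tonal copy of src: pixels whose luminance is below
     * the given threshold (0-255) become black, all others white.
     * The threshold is a hypothetical tuning parameter.
     */
    static BufferedImage binarize(BufferedImage src, int threshold) {
        int width = src.getWidth();
        int height = src.getHeight();
        // TYPE_BYTE_BINARY defaults to a 1-bit black/white palette.
        BufferedImage out =
                new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);

        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived luminance.
                int luminance = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luminance < threshold ? 0xFF000000 : 0xFFFFFFFF);
            }
        }
        return out;
    }
}
```

You would load the scan with `javax.imageio.ImageIO.read(...)`, call something like `Binarizer.binarize(image, 128)`, and pass the result to whatever OCR engine you end up using. Expect to experiment with the threshold, and to need something smarter for the textured background.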

My proposal: stop wasting your time and buy a solution that works. The time you spend developing something yourself will cost your organization much more.

However, I do understand that my answer may not be much use to you, since you are not your own boss and it is in your best interest that this job goes to you, even though that is a highly inefficient way of getting things done.

Tomato