+8  Q: 

OCR Web Service

I am searching for an OCR web service (possibly open source, preferably free) that simply receives an image and returns the text it contains.

I've looked at tesseract, OCRopus and GOCR but the only open server I could find is WeOCR. Unfortunately the detection rates (at least during my tests) are sub-par and the speed is not much better.

Does anyone have any experience with OCR web services? I guess the license of Tesseract allows the operation of such a service. Are there any out there?

+2  A: 

I'm doubtful that there's a free OCR service that works well. Even pay OCR can be a bit dodgy.

Maximillian
Tesseract should be kind of acceptable, but WeOCR is simply not. Some OCR company could offer a "pay per page" plan using web services. It doesn't have to be free.
sdfx
+2  A: 

If you're using .NET, you could build one quite easily using tessnet. In fact, I'm about to try that myself :-)

Mauricio Scheffer
+6  A: 

I'm interested in the same thing. Some things I've found:

OCRTerminal bills itself as a "Free web OCR service", but the account is limited to 30 docs. And the recognition on my iPhone page image was completely useless.

I really thought that anything could beat OCRTerminal's performance, but when I tried Qipit, I couldn't get it to even show me that I had uploaded a document. Maybe it really wants me to email to it, instead of using their webform. Meh.

Evernote does pretty well searching for keywords in your images, but I haven't found a way to get the text out of the service.

For engines to use to /make/ such a web service:

http://groundstate.ca/ocr has a 5/2007 review of several open source Linux OCR packages.

Tesseract and OCRopus are probably the best bets.
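
For anyone wanting to roll their own, here is a minimal sketch of what such a service could look like in Python. It is not a production setup, just an illustration under a few assumptions: the `tesseract` command-line tool is installed and on the PATH, your build accepts `stdout` as the output base (recent versions do), and the request body is the raw image bytes. The names `build_command`, `ocr_image`, `OcrHandler` and `serve` are made up for this example.

```python
import subprocess
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_command(image_path, lang="eng"):
    # "stdout" as the output base asks tesseract to print the text
    # instead of writing a .txt file (assumes a reasonably recent build).
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_bytes, run=subprocess.run):
    # Write the upload to a temp file, since the CLI wants a file path.
    # Note: reopening an open NamedTemporaryFile works on Linux/macOS,
    # but not on Windows.
    with tempfile.NamedTemporaryFile(suffix=".png") as tmp:
        tmp.write(image_bytes)
        tmp.flush()
        result = run(build_command(tmp.name), capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")


class OcrHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw image bytes from the request body.
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        try:
            text, status = ocr_image(image_bytes), 200
        except (OSError, subprocess.CalledProcessError) as exc:
            text, status = f"OCR failed: {exc}", 500
        body = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    # Single-threaded server; fine for experiments, not for real traffic.
    HTTPServer(("", port), OcrHandler).serve_forever()
```

To try it, call `serve()` and POST an image, for example with `curl --data-binary @page.png http://localhost:8000/`. The `run` parameter on `ocr_image` exists only so the subprocess call can be swapped out in tests; recognition quality and speed are whatever your Tesseract install delivers.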

Cuneiform is supposed to be an enterprise-ready, internationalized open source OCR system, but the web pages I've found seem very incomplete and/or poorly translated: http://www.cuneiform.ru http://en.openocr.org Someone ported the source to linux: https://launchpad.net/cuneiform-linux

Hi! I'm one of the guys working on OCR Terminal. Thanks so much for trying our service out! To clarify, we provide 30 free pages a day, not documents. We're working on a payment system for people who need to scan longer documents, as well as a full API, but that won't be ready for a while.
Gaurav
Also: our service was designed for recognition of scanned documents, and does that fairly well. We haven't really worked on getting it to do photographs properly yet; but if you send us an example photograph, we'll try and tweak the software to get it working better. Cheers!
Gaurav
A: 

You might want to try www.p2escan.com/ocrtool.aspx. It is completely free and will provide you with a searchable PDF or just plain text. They use a very accurate engine that runs on Amazon S3. Everything is encrypted, and when it is done the result is emailed back to you. Hope it helps.

And by "completely free" you mean "one free upload per week". A Plus membership costs $40 a month and you still have a limit of 10 documents per week.
sdfx
A: 

You can try http://www.synchronice.it. It's a very good web-based online OCR tool. It's free, with a limit of 10 files per day.

A: 

No one has mentioned DocMorph? http://docmorph1.nlm.nih.gov/docmorph/AccessibleOCR.htm This service works reasonably well and has been around for a while. Your tax dollars at work at the National Library of Medicine.

cboe
A: 

Hi there, I'm currently working on an OCR web service, but it's a commercial offering. The system we're building runs in a cloud infrastructure (probably Rackspace, possibly Amazon).

We have a high-spec commercial OCR engine, and will be able to return text or searchable PDF files from the service. As a true SaaS, users will be billed on a per-page basis.

+1  A: 

Use the Google Docs OCR API!

powtac
A: 

I have achieved good results using this one http://www.free-online-ocr.com . It supports both DOC and searchable PDF as an output.

Ra
+1  A: 

Take a look at WiseTrend ( http://www.wisetrend.com/wisetrend_ocr_cloud.shtml ) - REST API for OCR based on the ABBYY engine (great for low-quality images) with per-page pricing and a free trial available

Eugene Osovetsky
A: 

Try http://www.onlineocr.net/Default.aspx

It's free, but not open source.

m2green