views:

1602

answers:

4

Are there any free OCR libraries that work with PHP? Perhaps there is a different way to do OCR on a website. If so please share.

EDIT: Python is also acceptable.

EDIT2: I'm working with a Linux server, ideally.

+1  A: 

PHP is designed for web development. No one in their right mind would use it for something as computationally intensive as OCR. I would recommend that you search around for open source OCR solutions in general and then figure out how to use one with PHP after the fact. You'll likely just need to shell out to the appropriate executables.

It's not as elegant as having a native library, but if that's what you want, switch to a more general purpose language like Python. (I don't know if there is an OCR library for Python; I'm simply stating that Python is more flexible when it comes to libraries.)

jamieb
OK, python works as well.
Moshe
+4  A: 

Since you're on a Linux box, I would highly recommend OCRopus, an open source project sponsored by Google.

It's not PHP, but I think it will be your best option. Of course you can call it from within PHP via exec. It's mature and has a lot of options. From the project site:

The OCRopus engine is based on two research projects: a high-performance handwriting recognizer developed in the mid-90's and deployed by the US Census bureau, and novel high-performance layout analysis methods.

There is also another open source project, Tesseract. I've used this in the past as well and have been pleased with the results. It supports training, limiting the recognized alphabet, etc.
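
Since Python is acceptable too, here is a rough sketch of shelling out to the tesseract command-line tool from Python. It assumes the tesseract executable is installed and on your PATH, and that your version accepts "stdout" as the output base (older builds may only write to a file). The function names build_tesseract_command and ocr_image are my own, not part of any library, and the tessedit_char_whitelist setting for limiting the alphabet may behave differently depending on the Tesseract version and engine.

```python
import shutil
import subprocess
import sys
from typing import List, Optional


def build_tesseract_command(image_path: str, lang: str = "eng",
                            whitelist: Optional[str] = None) -> List[str]:
    """Build the argument list for a tesseract call (hypothetical helper)."""
    # "stdout" as the output base asks tesseract to print the text
    # instead of writing <outputbase>.txt (assumes a reasonably recent build).
    cmd = ["tesseract", image_path, "stdout", "-l", lang]
    if whitelist:
        # Restrict recognition to the given characters, e.g. digits only.
        cmd += ["-c", f"tessedit_char_whitelist={whitelist}"]
    return cmd


def ocr_image(image_path: str, lang: str = "eng",
              whitelist: Optional[str] = None) -> str:
    """Run tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, whitelist),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Usage: python ocr.py scanned_page.png
    print(ocr_image(sys.argv[1]))
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with odd file names. The same idea carries over to PHP's exec, as long as you escape the uploaded file name before putting it on the command line.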

nategood
Thanks for the pointer. I'll let you know how they go. Thanks!
Moshe
Still haven't gotten to it yet... It was for a client who needed to put the project on the back burner for a while...
Moshe
A: 

If you don't mind calling an online API to do the OCR for you, check out http://www.webservius.com/corp/docs/wisetrend.pdf

Eugene Osovetsky
A: 

Have you seen the phpOCR classes by Andrey Kucherenko? http://webscripts.softpedia.com/script/PHP-Clases/phpOCR-12040.html It's an old article but may help you.

Daniel D