I just installed Tesseract OCR and was able to convert a TIF image to its corresponding text. The application seems fairly easy to use, but I am struggling to find documentation that will help me make the most of it. So, here are a couple of questions I hope someone here can help me with:

  1. I'm converting PDFs to TIFs using ImageMagick. Which settings should I use for this conversion? In other words, which image settings are optimal for OCR?

  2. How do I use hOCR?

Thanks.