Hi experts,
I have a PDF document with content in Arabic, and when I try to search inside the document for a specific word, Adobe Reader returns no results.
It seems to be a format problem. How can I fix that? Thanks.
It might not actually be text, or it might be in a container that Reader doesn't pay attention to. It's especially common to expand text objects into vector shapes when you're dealing with fonts that most people aren't going to have installed on their system. It looks the same on the screen, but it's not searchable.
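If you want a quick way to tell which situation you're in, here is a rough sketch in Python. It only scans the raw bytes (plus any Flate-compressed streams it can inflate) for font resources and `/ToUnicode` entries. The function name `pdf_text_hints` is my own invention, and this is a heuristic, not a real PDF parser: it will miss things in encrypted files or in streams that use other filters.

```python
import re
import zlib

# Matches the raw body of each stream object in the file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

def pdf_text_hints(data: bytes) -> dict:
    """Rough heuristic: look for font and ToUnicode markers in a PDF's bytes."""
    chunks = [data]
    for match in STREAM_RE.finditer(data):
        try:
            # Try to inflate the stream; trailing bytes after the zlib data are ignored.
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # not Flate-compressed (an image, for example), so skip it
    blob = b"".join(chunks)
    return {
        "has_fonts": b"/Font" in blob,            # no fonts at all: outlines or a scan
        "has_to_unicode": b"/ToUnicode" in blob,  # fonts that carry a Unicode mapping
    }
```

You would call it as `pdf_text_hints(open("document.pdf", "rb").read())`, where `document.pdf` is a placeholder for your file. If `has_fonts` comes back false, the page is most likely vector shapes or a scanned image, and no amount of fiddling with Reader will make it searchable.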
There are at least four different ways to get text into a PDF document (in order of likelihood):
Case 1 is typically searchable. Case 2 is searchable if the font and encoding are sane. If they're not (and this is likely the case for non-Latin fonts), then there is probably no reliable way to map the encoded glyphs back to Unicode (and by the way, PDF is fairly Unicode-hostile). Case 3 is totally unsearchable without knowing more about how the PDF was generated. Case 4 is totally unsearchable.
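To make the Case 2 problem concrete, here is a toy illustration in Python. The glyph codes and the mapping table are made up for the example (a real PDF would carry this as a ToUnicode CMap attached to the font); the point is only that the page stores font-specific codes, and search works on whatever those codes map back to.

```python
# Hypothetical embedded font subset: the page stores glyph codes, not Unicode.
# Both the codes and the table below are invented for illustration.
to_unicode = {
    0x01: "\u0633",  # seen
    0x02: "\u0644",  # lam
    0x03: "\u0627",  # alef
    0x04: "\u0645",  # meem
}

def extract_text(glyph_codes, cmap):
    """Map glyph codes to Unicode; codes with no mapping become U+FFFD."""
    return "".join(cmap.get(code, "\ufffd") for code in glyph_codes)

page_codes = [0x01, 0x02, 0x03, 0x04]  # what the content stream actually holds

with_map = extract_text(page_codes, to_unicode)
without_map = extract_text(page_codes, {})  # same page, mapping missing

search_term = "\u0633\u0644\u0627\u0645"  # the word the user types
print(search_term in with_map)     # True
print(search_term in without_map)  # False
```

Both versions would draw the same shapes on screen, since rendering only needs the glyph codes. Only the one with a usable mapping can be searched.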
That said, all cases can be read with an OCR engine that understands Arabic. I understand that the Iris engine does Arabic.