I have a "searchable PDF", i.e. scanned page images overlaid with invisible but selectable text. (When I open this file in Acrobat, I am alerted "You are viewing this document in PDF/A mode.")
I need to extract the bounding rectangle of each word in this document. Can anyone suggest toolkits, and the methods in them, for getting the bounding boxes of the words in the invisible text layer?
I would prefer tools in Java, but I would appreciate any suggestions.
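To make the goal concrete, here is a rough sketch of the output I am after. It assumes the toolkit can hand me one rectangle per character of the hidden text. The `Glyph` and `Word` types and the `group` method are hypothetical names of my own, not part of any PDF library, and splitting words on whitespace is only my guess at how they would be delimited.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

final class WordBoxes {
    /** Hypothetical input: one character of the hidden text and its rectangle in page units. */
    record Glyph(char ch, Rectangle2D box) {}

    /** Hypothetical output: a word and the union of its character rectangles. */
    record Word(String text, Rectangle2D box) {}

    /** Groups glyphs (assumed to be in reading order) into words, splitting on whitespace. */
    static List<Word> group(List<Glyph> glyphs) {
        List<Word> words = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        Rectangle2D bounds = null; // running union for the current word

        for (Glyph g : glyphs) {
            if (Character.isWhitespace(g.ch())) {
                // Whitespace closes the current word, if one is open.
                if (bounds != null) {
                    words.add(new Word(text.toString(), bounds));
                    text.setLength(0);
                    bounds = null;
                }
                continue;
            }
            text.append(g.ch());
            // Copy the first box, then grow the union with each later glyph.
            bounds = (bounds == null) ? g.box().getBounds2D() : bounds.createUnion(g.box());
        }
        if (bounds != null) {
            words.add(new Word(text.toString(), bounds)); // flush the last word
        }
        return words;
    }
}
```

So what I am really asking for is the part that produces the per-character (or per-word) rectangles from the PDF in the first place. Once I have those, combining them as above is straightforward.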