Here's a tough one:
I need to be able to find a word's position and size (its frame) on the screen (its first occurrence is enough; from there I should be able to get the next ones).
For example, I would like to be able to detect word positions in (but not limited to) Word, Excel and PowerPoint for Mac, as well as Safari and others.
The solution should be as fast as possible: I should be able to find at least 5-6 words per second while using as little CPU time as possible.
Here's what I thought of so far:
- OCR on a screenshot of a window / graphics context (is there a good open-source framework that works on Mac OS X 10.4 and can be used in a commercial product?). Evernote is very good at spotting words in images. I don't know whether it uses a custom in-house engine or an open-source / commercial one, but that is the kind of engine I would like to use if this is a "valid" solution. Ideally I would detect the word's frame in the active application's window (how do I get the frame of another application's window?).
- Getting some kind of "hook" on Quartz drawing of text and intercepting the location of the word when it's drawn (does not seem very feasible at first glance!).
- AppleScript, but it depends a lot on what API the application offers (I don't think you can get a word's coordinates in a Word document from what I've seen) and it's slow.
- ... out of ideas ...
My goal is to get the frames of all the words in a paragraph, in the right order, based on a string containing the text of the paragraph.
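To make that goal concrete, here is a minimal sketch of the matching step only, assuming some engine (OCR or anything else) has already produced a list of recognized words with their frames in reading order. The `(text, (x, y, width, height))` layout and the names `normalize` and `frames_in_order` are hypothetical, chosen just for illustration; the part I'm actually missing is how to obtain that list.

```python
from typing import List, Optional, Tuple

# Hypothetical frame layout: (x, y, width, height) in screen coordinates.
Frame = Tuple[float, float, float, float]
Detected = Tuple[str, Frame]


def normalize(word: str) -> str:
    """Lowercase and strip surrounding punctuation so 'quick,' matches 'quick'."""
    return word.strip(".,;:!?\"'()[]").lower()


def frames_in_order(paragraph: str, detected: List[Detected]) -> List[Optional[Frame]]:
    """Return one frame per paragraph word, in paragraph order.

    The detected words are scanned forward only, so a repeated word is
    matched to its next occurrence rather than to the first one again.
    Words that are not found get None.
    """
    result: List[Optional[Frame]] = []
    cursor = 0  # index of the first detected word not yet consumed
    for word in paragraph.split():
        target = normalize(word)
        match: Optional[Frame] = None
        for i in range(cursor, len(detected)):
            if normalize(detected[i][0]) == target:
                match = detected[i][1]
                cursor = i + 1
                break
        result.append(match)
    return result


if __name__ == "__main__":
    detected_words = [
        ("The", (0, 0, 30, 12)),
        ("quick", (35, 0, 40, 12)),
        ("the", (80, 0, 30, 12)),
        ("fox", (115, 0, 25, 12)),
    ]
    print(frames_in_order("The quick, the fox", detected_words))
```

This is a greedy forward scan, so a misrecognized word simply yields `None`, but a word missing from the detection can also be matched to a later occurrence and skip past words in between; a real implementation would probably need fuzzier matching.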
Thanks in advance for any hints!