Hi,

I need to find the word in an image that the user has clicked on. So far I have succeeded in OCRing the image. I have a PictureBox control in my C# app. The user can draw a box around any text and drag it to a text box to fill the text box with it; this part is done. But now I have a new requirement: the user selects a text box and then clicks on a word in the image, and that word should fill the text box.

I have no clue how to proceed. How can I cut out just the word's portion of the image and give it to the OCR engine? The user can click on any part of the word.

Please give me any pointers. What algorithm should I follow to find the boundaries of the word the user clicks on? If I can find the boundaries, I can cut that region out of the image using the CopyFromScreen method and give it to the OCR engine to get the text.

I hope I have made my problem clear.

Thanks and Regards, Dinesh.

+2  A: 

If you have the OCR working, my initial approach would be to attempt some sort of search centered on the initial click point.

I.e., make a small box around where the user clicked and OCR it; if the result is all noise, make the box bigger and OCR again, repeating until the OCR produces a hit.
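
The loop described above might look roughly like this. It is a minimal Python sketch, not tied to any OCR library: `ocr_region` is a hypothetical callback that OCRs one rectangle of the image and returns the recognized text (empty when it finds nothing usable), and the start size, step, and upper limit are arbitrary assumptions.

```python
from typing import Callable, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in image pixels


def clamp_box(cx: int, cy: int, half: int, img_w: int, img_h: int) -> Box:
    """Square box of half-size `half` around (cx, cy), clipped to the image."""
    return (max(0, cx - half), max(0, cy - half),
            min(img_w, cx + half), min(img_h, cy + half))


def find_word_by_growing_box(
    ocr_region: Callable[[Box], str],  # hypothetical: OCR one region, return its text
    click: Tuple[int, int],
    image_size: Tuple[int, int],
    start_half: int = 10,
    step: int = 10,
    max_half: int = 200,
) -> Optional[str]:
    """Grow a box around the click until the OCR callback returns some text."""
    cx, cy = click
    img_w, img_h = image_size
    half = start_half
    while half <= max_half:
        text = ocr_region(clamp_box(cx, cy, half, img_w, img_h)).strip()
        if text:
            return text
        half += step  # nothing recognized yet, so widen the search
    return None  # gave up without a hit
```

Every iteration costs one full OCR call, so the step size trades accuracy against speed.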

Gregory
Hi, I tried this, but the problem is that the OCR is very slow even for a small image, so this takes a lot of time. I have got it somewhat working when the user clicks in the middle of a word: I keep increasing the width of the box until the number of words I get is 2. But this is not the optimal solution, is it?
Dinesh
+1  A: 

If you've got the OCR data, depending on the OCR library, you might be able to perform a reverse lookup and determine the character at the specified pixel coordinates. The OCR libraries I've worked with provide rectangle coordinates for each character, which in turn can be grouped into words (combining the rects). The problem then is simply to determine inside which rectangle the click occurred.
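
As a rough illustration of that grouping and lookup, here is a minimal Python sketch. It assumes the OCR library returns one rectangle per character, that the characters belong to a single line of text, and that a horizontal gap larger than `max_gap` pixels separates words. The `CharBox` and `WordBox` types and the gap threshold are made up for the example and do not come from any particular library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CharBox:
    char: str
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class WordBox:
    text: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def group_into_words(chars: List[CharBox], max_gap: int = 4) -> List[WordBox]:
    """Merge the characters of one text line into words by horizontal gap."""
    words: List[WordBox] = []
    for c in sorted(chars, key=lambda ch: ch.left):
        last = words[-1] if words else None
        if last is not None and c.left - last.right <= max_gap:
            # Close enough to the previous character: extend the current word.
            last.text += c.char
            last.right = max(last.right, c.right)
            last.top = min(last.top, c.top)
            last.bottom = max(last.bottom, c.bottom)
        else:
            words.append(WordBox(c.char, c.left, c.top, c.right, c.bottom))
    return words


def word_at(words: List[WordBox], x: int, y: int) -> Optional[str]:
    """Return the text of the word whose rectangle contains the click."""
    for w in words:
        if w.contains(x, y):
            return w.text
    return None
```

The OCR then runs only once per image; each click is just a rectangle lookup.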

codelogic
Hi, I am using the Microsoft Office Document Imaging 12.0 library. Is it possible to get the rectangular coordinates of the characters in the image using this library? If not, what other library can I use? Can you tell me which libraries you have worked with? I can certainly get the word if I get all the rectangular coordinates. Thanks very much.
Dinesh
+1  A: 

Hi friends, thanks for your support. I got it working with MODI 12.0 itself. Thanks to codelogic for pointing me in the right direction. I found that MODI 12.0 provides a way to get the rectangular coordinates of all the words in an image; from there my job got very easy. Thanks a lot. :)
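
For anyone landing here later, the lookup itself can be sketched in a few lines. This is a Python illustration rather than the actual C# code: it assumes each word has already been read out of the OCR result as its text plus one or more rectangles in image pixel coordinates, and that the click has been converted to those same coordinates. The tuple layout, the `pick_word` name, and the small tolerance for clicks that land just outside a word are illustrative choices, not part of MODI.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)
Word = Tuple[str, List[Rect]]     # word text plus its rectangle(s)


def distance_to_rect(x: int, y: int, rect: Rect) -> float:
    """Distance from a point to a rectangle; 0.0 when the point is inside."""
    left, top, right, bottom = rect
    dx = max(left - x, 0, x - right)
    dy = max(top - y, 0, y - bottom)
    return (dx * dx + dy * dy) ** 0.5


def pick_word(words: List[Word], x: int, y: int,
              tolerance: float = 5.0) -> Optional[str]:
    """Return the word under the click, or the nearest one within tolerance."""
    best_text: Optional[str] = None
    best_dist: Optional[float] = None
    for text, rects in words:
        for rect in rects:
            d = distance_to_rect(x, y, rect)
            if d <= tolerance and (best_dist is None or d < best_dist):
                best_text, best_dist = text, d
    return best_text
```

For example, with `words = [("Invoice", [(10, 10, 70, 25)]), ("Total", [(80, 10, 120, 25)])]`, a click at `(30, 15)` picks "Invoice" and a click far from both rectangles returns `None`.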

Dinesh