I am using tessnet2 to extract the sentence in this image file. When I call the tessnet2 function with bmp it fails (it returns "~" as my sentence), but when I use bmp2 instead it works. Why?

The reason I am doing FromFile here is that I am grabbing the image from my server and using Image.FromStream to load it directly instead of saving it to a file. Why are these two bitmaps different, and what can I do to get the tessnet2 function to work with bmp the way it does with bmp2?

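            img = System.Drawing.Image.FromFile(imgUrl);
            // Fails with tessnet2: Bitmap created from an existing Image
            var bmp = new System.Drawing.Bitmap(img);
            // Works with tessnet2: Bitmap loaded directly from the file
            var bmp2 = new System.Drawing.Bitmap(imgUrl);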
A: 

I'd recommend saving each of the images out to disk after each of the 3 steps. So you'll end up with 3 files (img, bmp & bmp2).

Then use something like Paint.NET to subtract 1 image from another:

  • img - bmp
  • bmp - bmp2
  • bmp2 - img

If any of the results aren't a completely blank image, then the images in the 3 steps are different.
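The same comparison can also be done in code rather than in an image editor. The following is only a minimal sketch: ImageDiff, CountDifferentPixels and Describe are hypothetical helper names, not part of tessnet2 or System.Drawing. It counts differing pixels and also reports pixel format and resolution, which a visual subtraction would not show.

```csharp
using System;
using System.Drawing;

static class ImageDiff
{
    // Counts pixels whose ARGB values differ between two bitmaps.
    // Returns -1 when the dimensions differ, because a per-pixel
    // comparison is not meaningful in that case.
    public static int CountDifferentPixels(Bitmap a, Bitmap b)
    {
        if (a.Width != b.Width || a.Height != b.Height)
            return -1;

        int different = 0;
        for (int y = 0; y < a.Height; y++)
            for (int x = 0; x < a.Width; x++)
                if (a.GetPixel(x, y).ToArgb() != b.GetPixel(x, y).ToArgb())
                    different++;
        return different;
    }

    // Summarizes size, pixel format and resolution of a bitmap.
    public static string Describe(Bitmap bmp)
    {
        return string.Format("{0}x{1}, {2}, {3}x{4} dpi",
            bmp.Width, bmp.Height, bmp.PixelFormat,
            bmp.HorizontalResolution, bmp.VerticalResolution);
    }
}
```

Calling ImageDiff.CountDifferentPixels(bmp, bmp2) and printing ImageDiff.Describe for each bitmap would show whether they differ in pixels, or only in format or DPI. That second case is worth checking because new Bitmap(img) draws the source into a new bitmap instead of reusing the original pixel data.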

If they're the same, then I can only think that something is wrong with the tessnet2 library, as it's producing different results when you call it on identical images! Could the image be on the very edge of an acceptable read? Are there any settings in the library that would make it more tolerant?

Matt Warren
A: 

Instead of calling FromFile, try using FromStream. You could try something like this:

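// Read the whole file into memory so the file itself is not kept open
MemoryStream ms = new MemoryStream(File.ReadAllBytes(imgUrl));
// Keep ms open for as long as img is in use
img = Image.FromStream(ms);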

Image.FromFile keeps the file locked until the Image is disposed, and this may be what's causing a bad read from tessnet2.
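If the intermediate copy made by new Bitmap(img) turns out to be the problem, another thing to try is building the Bitmap straight from the stream, which is closer to how bmp2 is created. This is a sketch only: StreamLoader and LoadBitmap are hypothetical names, the byte array stands in for whatever you download from your server, and I have not verified that this changes the tessnet2 result.

```csharp
using System.Drawing;
using System.IO;

static class StreamLoader
{
    // Builds a Bitmap directly from raw image bytes,
    // for example bytes downloaded from a server.
    public static Bitmap LoadBitmap(byte[] imageBytes)
    {
        // Deliberately not disposed here: the stream has to stay open
        // for the lifetime of the Bitmap created from it.
        var ms = new MemoryStream(imageBytes);
        return new Bitmap(ms);
    }
}
```

Usage would be something like var bmp = StreamLoader.LoadBitmap(bytes); and then passing bmp to tessnet2 in place of the Bitmap copied from img.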

Jacob Ewald