I'm using my digital camera as a quick and dirty scanner. Resolution is actually around 300dpi, which is quite reasonable. But my camera produces a color image, which I want reduced to a bitmap. I do not want to dither the image; I'm looking for what I would get if I put the document through a black-and-white scanner. Converting a JPEG to a greyscale image is easy and standard using djpeg -grayscale. The hard part is deciding which gray pixels should be white and which should be black.

The pbmplus tools offer

djpeg -grayscale -pnm img.jpg | pgmtopbm -threshold -value $v > img.pbm

But the killer is that value $v. Good values seem to range anywhere from 0.3 to 0.6, and repeated trial and error by hand is killing me. (For those more familiar with ImageMagick, the $v at hand is the value of the -black-threshold parameter.)

I suppose I could build a GUI that would help me find a threshold faster by hand, but what I'm really looking for is an algorithm to set the threshold to convert a greyscale image to a clean bitmap. Ideally this would work just by examining the structure of the grayscale image!

A: 

Colorspace conversion is normally done using device-dependent profiles.

Your camera is producing an image according to its color profile and, if you had a scanner, it would be producing an image according to its color profile.

Provided that you had an ICC or ICM profile for your camera, and one that came close to matching the output of your hypothetical scanner device, you could use ImageMagick or Little CMS to apply the profiles to your data.

Since I doubt that you have an ICC or ICM profile for your camera, you could just choose one of the many free ones available on the Internet. An appropriate grayscale output profile may be a bit more difficult to find.

If you can't apply device-specific profiles, you may find that ImageMagick can do what you need with its -colorspace option.

The ICC (International Color Consortium) has some programs and information you may find useful.

algorithm to set threshold to convert a greyscale image to a clean bitmap

If all else fails you might be able to derive a value from the histogram of a grayscale image?
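One common way to derive a threshold from a histogram is Otsu's method, which picks the gray level that best separates the pixels into a dark class and a light class (it maximizes the between-class variance). This is not something from the question or from the tools mentioned above, just a minimal sketch of that technique. The function names are made up for illustration, the input is assumed to be 8-bit gray values (0..255), and the final division assumes that the $v fraction maps linearly onto that range.

```python
def histogram_of(pixels, levels=256):
    """Count how many pixels fall on each gray level (hypothetical helper)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist


def otsu_threshold(histogram):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the dark class, the rest the light class.
    """
    total = sum(histogram)
    if total == 0:
        raise ValueError("empty histogram")
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_t, best_between = 0, -1.0
    w_dark = 0    # number of pixels at or below the candidate level
    sum_dark = 0  # sum of the gray values of those pixels
    for t, count in enumerate(histogram):
        w_dark += count
        sum_dark += t * count
        if w_dark == 0:
            continue          # no dark pixels yet, nothing to separate
        w_light = total - w_dark
        if w_light == 0:
            break             # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Proportional to the between-class variance; the constant factor
        # 1 / total**2 does not change where the maximum is.
        between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t


# Toy data: dark "ink" around 30-40, light "paper" around 200-220.
pixels = [30] * 100 + [40] * 50 + [200] * 300 + [220] * 500
t = otsu_threshold(histogram_of(pixels))
v = t / 255  # assumed linear mapping to a 0..1 threshold fraction
print(t, round(v, 3))
```

With real data you would fill the histogram from the grayscale pixels that djpeg produces rather than from a toy list. Unevenly lit pages (which are common with a camera) may need a locally adaptive threshold instead of a single global value, so treat this as a starting point.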

(Your post is a little confusing, as you state your camera outputs a color image but you want that image to appear as if it were run through a black-and-white scanner. Then later on you state you want an algorithm to convert a grayscale image to a "clean" bitmap. Please clarify.)

Edit - In case you are still looking for an answer, ImageMagick can be used to apply an ICC profile. Also this page of scanner profile info may be of some help.

convert RGB.jpg -profile "Scanner.icc" BW.jpg
Getting to grayscale is no problem. I've clarified that and added an ImageMagick example. The question is how to find the threshold value. I'm sure looking at a histogram might help. But then what?
Norman Ramsey
Without some type of resampling (bicubic, linear, etc.) I don't know that a formula exists. There are several pieces of data that can be derived from the histogram, like the range of actual values and the actual values themselves. From there you're on your own. This link might help: http://stackoverflow.com/questions/1173200/need-c-function-to-convert-grayscale-tiff-to-black-white-monochrome-1bpp-tif
I stand corrected: it's not resampling, it's color quantization. http://en.wikipedia.org/wiki/Color_quantization
Another thought: load the image into Photoshop or GIMP and assign a monochrome palette to it. What adjustments or options do those programs give? Perhaps you can extrapolate something from them.