I am scanning documents to JPG images. The scanner must scan either all pages as color or all pages as black and white. Since many of my pages are color, I must scan all pages as color. After the scanning is complete, I would like to examine the images with .NET and detect which ones are black and white, so that I can convert those images to grayscale and save on storage.

Does anybody know how to detect a grayscale image with .NET?

Please let me know.

+3  A: 

A simple algorithm to test for color: walk the image pixel by pixel in a nested for loop (width and height) and test whether each pixel's R, G and B values are equal to one another. If they are not, the image has color info. If you make it all the way through all the pixels without encountering this condition, you have a grayscale image.
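
A minimal sketch of that pixel walk, assuming the image is already loaded as a System.Drawing Bitmap. The class and method names (ExactGrayCheck, IsGrayscaleExact) are made up for illustration:

```csharp
using System.Drawing;

public static class ExactGrayCheck
{
    // Returns true only if every pixel has R == G == B exactly.
    // ExactGrayCheck and IsGrayscaleExact are hypothetical names.
    public static bool IsGrayscaleExact(Bitmap bmp)
    {
        for (int y = 0; y < bmp.Height; y++)
        {
            for (int x = 0; x < bmp.Width; x++)
            {
                Color c = bmp.GetPixel(x, y);
                if (c.R != c.G || c.G != c.B)
                {
                    return false; // the first colored pixel settles it
                }
            }
        }

        return true; // no pixel with unequal channels was found
    }
}
```

Returning at the first colored pixel means true color scans are usually rejected quickly; only the grayscale pages need a full pass.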

Revision with a more complex algorithm:

In the first revision of this post I proposed a simple algorithm that treats a pixel as grayscale if its R, G and B values are all equal. So RGB values of 0,0,0 or 128,128,128 or 230,230,230 would all test as gray, while 123,90,78 would not. Simple.

Here's a snippet of code that tests for a variance from gray. The two methods are a small subsection of a more complex process but ought to provide enough raw code to help with the original question.

/// <summary>
/// This function accepts a bitmap and then performs a delta
/// comparison on all the pixels to find the highest delta
/// color in the image. This calculation only works for images
/// which have a field of similar color and some grayscale or
/// near-grayscale outlines. The result ought to be that the
/// calculated color is a sample of the "field". From this we
/// can infer which color in the image actually represents a
/// contiguous field in which we're interested.
/// See the documentation of GetRgbDelta for more information.
/// </summary>
/// <param name="bmp">A bitmap for sampling</param>
/// <returns>The highest delta color</returns>
public static Color CalculateColorKey(Bitmap bmp)
{
    Color keyColor = Color.Empty;
    int highestRgbDelta = 0;

    for (int x = 0; x < bmp.Width; x++)
    {
        for (int y = 0; y < bmp.Height; y++)
        {
            // Read the pixel once; GetPixel is slow, so avoid repeated calls.
            Color pixel = bmp.GetPixel(x, y);
            int rgbDelta = GetRgbDelta(pixel);
            if (rgbDelta <= highestRgbDelta) continue;

            highestRgbDelta = rgbDelta;
            keyColor = pixel;
        }
    }

    return keyColor;
}

/// <summary>
/// Utility method that encapsulates the RGB Delta calculation:
/// delta = abs(R-G) + abs(G-B) + abs(B-R) 
/// So, between the colors RGB(50,100,50) and RGB(128,128,128),
/// the first would have the higher delta, with a value of 100, compared
/// to the second color which, being grayscale, would have a delta of 0.
/// </summary>
/// <param name="color">The color for which to calculate the delta</param>
/// <returns>An integer in the range 0 to 510 indicating the difference
/// in the RGB values that comprise the color</returns>
private static int GetRgbDelta(Color color)
{
    return
        Math.Abs(color.R - color.G) +
        Math.Abs(color.G - color.B) +
        Math.Abs(color.B - color.R);
}
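
For the original question, the same delta can be compared against a tolerance instead of being used to pick a key color. The following is only a sketch: the names DeltaGrayCheck, RgbDelta and IsNearGrayscale are hypothetical, and maxAllowedDelta is an assumed tolerance that would have to be tuned on real scans.

```csharp
using System;
using System.Drawing;

public static class DeltaGrayCheck
{
    // Same formula as GetRgbDelta: 0 for pure gray, at most 510.
    public static int RgbDelta(Color c)
    {
        return Math.Abs(c.R - c.G) +
               Math.Abs(c.G - c.B) +
               Math.Abs(c.B - c.R);
    }

    // Treats the image as grayscale when no pixel's delta exceeds
    // maxAllowedDelta (an assumed tolerance, not a value from this post).
    public static bool IsNearGrayscale(Bitmap bmp, int maxAllowedDelta)
    {
        for (int y = 0; y < bmp.Height; y++)
        {
            for (int x = 0; x < bmp.Width; x++)
            {
                if (RgbDelta(bmp.GetPixel(x, y)) > maxAllowedDelta)
                {
                    return false; // this pixel is too far from gray
                }
            }
        }

        return true;
    }
}
```

With maxAllowedDelta set to 0 this degenerates to the exact R = G = B test; larger values tolerate a slight color cast.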
Paul Sasik
Some scanners will introduce a slight bit of color into otherwise black and white images. You should allow a small threshold for the colors to be not quite equal.
Andres
Wouldn't an image with RGB values of 128,128,128 at ALL pixels be considered just a (one-color-)gray rectangular picture?
chrischu
@chrischu: Well, I think that was just an example showing how all values would be equal.
Beska
Yannick M.
Just to allow for expected variation from scanning, I'd suggest ameliorating this a little. Doing something like: colorDiff = (Red - Blue) ^ 2 + (Red - Green) ^ 2. If colorDiff < COLOR_DIFF_MAX, presume grayscale -- I'd run the calculation on some known-grayscale scans to find a reasonable value for COLOR_DIFF_MAX.
Conspicuous Compiler
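
A rough sketch of the squared-difference idea from the comment above, including the calibration step on known-grayscale scans. The names SquaredDiffCheck, ColorDiff, MaxColorDiff and CalibrateColorDiffMax are invented here, and the squares are written as multiplications because ^ is XOR in C#, not exponentiation.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

public static class SquaredDiffCheck
{
    // colorDiff = (Red - Blue)^2 + (Red - Green)^2
    public static int ColorDiff(Color c)
    {
        int rb = c.R - c.B;
        int rg = c.R - c.G;
        return rb * rb + rg * rg;
    }

    // Largest colorDiff found anywhere in one image.
    public static int MaxColorDiff(Bitmap bmp)
    {
        int max = 0;
        for (int y = 0; y < bmp.Height; y++)
        {
            for (int x = 0; x < bmp.Width; x++)
            {
                max = Math.Max(max, ColorDiff(bmp.GetPixel(x, y)));
            }
        }

        return max;
    }

    // Calibration: the largest colorDiff seen across scans known to be
    // grayscale. This is only a starting point for COLOR_DIFF_MAX; some
    // margin on top is assumed to be needed in practice.
    public static int CalibrateColorDiffMax(IEnumerable<Bitmap> knownGrayscaleScans)
    {
        int max = 0;
        foreach (Bitmap bmp in knownGrayscaleScans)
        {
            max = Math.Max(max, MaxColorDiff(bmp));
        }

        return max;
    }
}
```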
@Beska I know that it was just an example. My statement still has value, though, because it doesn't matter whether the example values are 128, 3, or 42: a picture that passes this check is a picture of a SINGLE color and not a graySCALE picture.
chrischu
@Kigurai: So is this plain wrong or not the only way? It cannot be both. I was aiming for vanilla simplicity first: "simple algorithm to test for color". This morning I followed up with a more complex example that will allow for slightly "off" gray.
Paul Sasik
On second thought, I might have mixed things up in my head and judged a bit too quickly. Removing my previous comment, and the downvote.
kigurai
+10  A: 

If you can't find a library for this, you could try grabbing a large number (or all) of the pixels for an image and see if their r, g, and b values are within a certain threshold (which you might set empirically, or have as a setting) of one another. If they are, the image is grayscale.

I would definitely make the threshold for a test a bit larger than 0, though...so I wouldn't test r=g, for example, but (abs(r-g) < e) where e is your threshold. That way you can keep your false color positives down...as I suspect you'll otherwise get a decent number, unless your original image and scanning techniques give precisely grayscale.
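
Something along these lines might do as a starting point. It is a sketch only: SampledGrayCheck, ChannelsWithin and IsGrayscaleSampled are made-up names, e is the threshold described above, and step is an assumed sampling stride (1 checks every pixel).

```csharp
using System;
using System.Drawing;

public static class SampledGrayCheck
{
    // True when r, g and b all lie within e of one another,
    // i.e. the spread between the largest and smallest channel is below e.
    public static bool ChannelsWithin(Color c, int e)
    {
        int max = Math.Max(c.R, Math.Max(c.G, c.B));
        int min = Math.Min(c.R, Math.Min(c.G, c.B));
        return max - min < e;
    }

    // Checks every step-th pixel in each direction.
    // Because the test is a strict "< e", e must be at least 1.
    public static bool IsGrayscaleSampled(Bitmap bmp, int e, int step)
    {
        if (step < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(step));
        }

        for (int y = 0; y < bmp.Height; y += step)
        {
            for (int x = 0; x < bmp.Width; x += step)
            {
                if (!ChannelsWithin(bmp.GetPixel(x, y), e))
                {
                    return false; // a sampled pixel is clearly colored
                }
            }
        }

        return true;
    }
}
```

The trade-off with a stride above 1 is that a small colored region, such as a logo or a stamp, can fall between the sampled pixels and go unnoticed.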

Beska
+1, especially for the threshold suggestion.
Chris W. Rea
Somebody unfairly downvoted you without commenting. Tsk tsk.
Chris W. Rea
Yeah...happens. Sad, but *shrug*.
Beska
I would not use a library to do something as simple as detecting whether or not R=G=B
Ed Swangren
Or check a delta against some threshold, which would be the better approach.
Ed Swangren
@Ed: Isn't that pretty much what I said?
Beska
@Beska - thank you for your help. You and psasik gave me the information I needed. I wish I could mark two answers.
Dave
No problem at all. Glad to help.
Beska
A: 

Since JPEG supports metadata, you should first check whether your scanner software places some special data in the saved images, and whether you can rely on that information.

Rubens Farias
This doesn't make sense to me. The scanner software, if it writes metadata into the file, will write that the image is a color image if it is scanned as color (which it is), even if the image only contains grayscale content.
Beska
It was an idea, and I did point out that this hypothetical data would need validating, Beska. Anyway, thanks for your comment.
Rubens Farias