I have a scanned image which is basically black print on some weird (non-gray) background, say, green or yellow (think old paper).

How can I get rid of the green/yellow and get a grayscale picture that keeps as much of the gray structure of the original image as possible? That is, I want to keep the gray around the letters (the anti-aliasing) and any gray areas, but turn anything that is even remotely green/yellow into pure white.

Note that the background is by no means homogeneous, so the algorithm should be able to accept a color and an error margin, or a color range.

For bonus points: How can I automatically determine the background color?

I'd like to use Python with the Imaging Library or maybe ImageMagick.

Note: I'm aware of packages like unpaper. My problem with unpaper is that it produces B&W images, which probably work well for OCR software but don't look good to the human eye.

+1  A: 

I am more of a C++ than a Python programmer, so I can't give you a code sample, but the general algorithm is something like this:

Finding the background color: You make a histogram of the image. The histogram should have two peaks, representing the background and foreground colors. Because you know that the background has the higher intensity, you choose the peak with the higher intensity; that is the background color. Now you have the RGB background (R_bg, G_bg, B_bg).

Setting the background to white: You loop over all pixels and calculate the distance to the background:

distance = sqrt((R_bg - R_pixel) ^ 2 + (G_bg - G_pixel) ^ 2 + (B_bg - B_pixel) ^ 2)

If the distance is less than a threshold you set the pixel to white. You can experiment with different thresholds until you get a good result.
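The two steps above could be sketched in Python roughly as follows. This is only an illustration, not the answerer's code: the function names and the default threshold of 60 are made up, and it assumes the background covers most of the page, so it simply takes the tallest bin of each channel's histogram instead of comparing two peaks. Because the channels are treated independently, the estimate may not be a color that actually occurs in the image.

```python
import numpy as np
from PIL import Image

def estimate_background(rgb):
    """Guess the background as the most frequent value of each channel.

    Assumption: the background covers most of the page, so the tallest
    histogram bin of each channel belongs to it.
    """
    # np.bincount gives a 256-bin histogram per channel; argmax is its peak.
    return tuple(
        int(np.bincount(rgb[..., c].ravel(), minlength=256).argmax())
        for c in range(3)
    )

def whiten_background(image, threshold=60.0):
    """Set pixels closer than `threshold` to the estimated background to white."""
    rgb = np.asarray(image.convert("RGB"), dtype=np.uint8)
    bg = np.array(estimate_background(rgb), dtype=np.float64)
    # Euclidean distance in RGB space, as in the formula above.
    distance = np.sqrt(((rgb.astype(np.float64) - bg) ** 2).sum(axis=2))
    out = rgb.copy()
    out[distance < threshold] = 255  # hard cut-off: no smooth transition
    return Image.fromarray(out)
```

With a hypothetical file name, usage would look like `whiten_background(Image.open("scan.png")).convert("L").save("clean.png")`. The threshold has to be tuned per scan.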

Dani van der Meer
That is going to leave some jagged edges, but it's a good start.
Georg
True. You could also go the other way and try to label pixels as foreground color based on their distance from the foreground peak in the histogram, and turn everything else to white.
Dani van der Meer
Isn't RGB space wrong for this kind of application? Wouldn't HSV/HSL be better?
Aaron Digulla
I assumed you had an RGB image, so I described it like this. You could try other color spaces. HSV/HSL is not necessarily better. It better describes how humans perceive color, but it is not certain that it will lead to a better segmentation of the image you have.
Dani van der Meer
I have an RGB image but I can convert it. My problem is to select the right colors. Your algorithm works if the background has a uniform color but in my case, it's very uneven - for a computer.
Aaron Digulla
What about trying to segment the foreground (print) and setting everything else to white? You can plot the histogram of the image (possibly by using a tool like GIMP) to get a feel for how well the foreground/background peaks of the histogram are defined.
Dani van der Meer
+1  A: 

I was looking to make an arbitrary background color transparent a while ago and developed this script. It takes the most popular (background) color in an image and creates an alpha mask where the transparency is proportional to the distance from the background color. Computing RGB colorspace distances is expensive for large images, so I've tried some optimization using numpy and a fast integer sqrt approximation. Converting to HSV first might be the right approach. If you haven't solved your problem yet, I hope this helps:

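from PIL import Image
import numpy

fldr = r'C:\python_apps'
fp = fldr+'\\IMG_0377.jpg'

rz = 0  # 2 will halve the size of the image, etc.

# ----------------

# the histogram slicing below assumes three bands, so force RGB
im = Image.open(fp).convert('RGB')

if rz:
    w,h = im.size
    im = im.resize((w//rz, h//rz))  # resize() needs integer sizes

# histogram() returns 256 bins per band; the peak of each band is
# taken as that channel's background value
hist = im.histogram()
rgb = r0,g0,b0 = [b.index(max(b)) for b in [ hist[i*256:(i+1)*256] for i in range(3) ]]

def isqrt(n):
    # integer square root by Newton's iteration (integer division throughout)
    xn = 1
    xn1 = (xn + n//xn)//2
    while abs(xn1 - xn) > 1:
        xn = xn1
        xn1 = (xn + n//xn)//2
    while xn1*xn1 > n:
        xn1 -= 1
    return xn1

vsqrt = numpy.vectorize(isqrt)

def dist(image):
    # per-pixel distance from the background color, clipped to 0..255
    imarr = numpy.asarray(image, dtype=numpy.int32)  # dtype=numpy.int8
    d = (imarr[:,:,0]-r0)**2 + (imarr[:,:,1]-g0)**2 + (imarr[:,:,2]-b0)**2
    d = numpy.asarray((vsqrt(d)).clip(0,255), dtype=numpy.uint8)
    return Image.fromarray(d)  # a 2-D uint8 array becomes an 'L' image

# background pixels become transparent, distant pixels opaque
im.putalpha(dist(im))
im.save(fldr+'\\test.png')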
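To get from a distance-based mask like this to the gray-on-white picture the question asks for, one option is to blend each pixel over a white page using that distance as opacity. The following is a rough sketch of the idea, not part of the script above: `background_to_white_gray` and its `scale` parameter are made-up names, the background color is passed in by the caller, and `numpy.sqrt` on the whole array stands in for the element-wise integer square root.

```python
import numpy as np
from PIL import Image

def background_to_white_gray(image, background, scale=1.0):
    """Fade pixels toward white the closer they are to `background`,
    and return the result as a grayscale ('L') image.

    `scale` is a hypothetical tuning knob: values above 1 make pixels
    reach full opacity at a smaller distance from the background.
    """
    rgb = np.asarray(image.convert("RGB"), dtype=np.float64)
    bg = np.asarray(background, dtype=np.float64)
    # Euclidean RGB distance to the background for all pixels at once.
    distance = np.sqrt(((rgb - bg) ** 2).sum(axis=2))
    # Opacity in [0, 1]: 0 at the background color, 1 once distance*scale >= 255.
    alpha = np.clip(distance * scale / 255.0, 0.0, 1.0)
    gray = np.asarray(image.convert("L"), dtype=np.float64)
    # Composite the gray value over a white page with that opacity.
    out = alpha * gray + (1.0 - alpha) * 255.0
    return Image.fromarray(np.rint(out).astype(np.uint8))
```

Pixels that match the background come out pure white, dark print stays dark, and in-between pixels (such as anti-aliased edges) land somewhere in between, so the transition is softer than with a hard threshold. How well this works on an uneven background depends on choosing `background` and `scale` for the scan at hand.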
bpowah
Thanks, I'll give it a try.
Aaron Digulla