I've been playing around with image processing lately, and I'd like to know how the unsharp mask algorithm works. I'm looking at the source code for Gimp and its implementation, but so far I'm still in the dark about how it actually works. I need to implement it for a project I'm working on, but I'd like to actually understand the algorithm I'm using.
Unsharp Mask works by generating a blurred version of the image using a Gaussian blur filter, and then subtracting this from the original image (with some weighting value applied), i.e.
blurred_image = blur(input_image)
output_image = input_image - blurred_image * weight
The key is the idea of spatial frequency. A Gaussian filter passes only low spatial frequencies, so if you do something like:
2*(original image) - (gaussian filtered image)
Then its effect in the spatial frequency domain is:
(2 * all frequencies) - (low frequencies) = (2 * high frequencies) + (1 * low frequencies).
So, in effect, an 'unsharp mask' boosts the high-frequency components of the image; the size of the Gaussian filter and the weights used when the images are subtracted determine the exact properties of the filter.
For more information have a read of ~page 70 of this document.
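To make that concrete, here's a minimal sketch in Python using NumPy/SciPy. It's not GIMP's implementation; the sigma and amount values are just illustrative, and it assumes a single-channel float image in the range [0, 1]:

import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(image, sigma=2.0, amount=1.0):
    # Gaussian blur keeps only the low spatial frequencies.
    blurred = gaussian_filter(image, sigma=sigma)
    # The difference is the high-frequency part: the "unsharp mask" itself.
    mask = image - blurred
    # Add the weighted high frequencies back to the original.
    sharpened = image + amount * mask
    return np.clip(sharpened, 0.0, 1.0)

With amount = 1 this is exactly the 2*(original image) - (gaussian filtered image) case above; a larger amount boosts the high frequencies more strongly.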
I wasn't sure how it worked either but came across a couple of really good pages for understanding it. Basically it goes like this:
- What's the opposite of a sharpened image? A blurry one. We know how to blur an image. Duplicate the original image and perform some Gaussian blurring. This is the Radius slider on most USM dialogs.
- Well, if we subtract away the blurriness, we should be left with the parts that are high-contrast! Think about it: if you blur a sky, it still looks like a sky. Subtract the pixels and you get sky - sky = 0. If you blur a Coke logo, you get a blurry Coke logo. Subtract it and you're left with the edges. So take the difference: subtract the blurred copy from the original, pixel by pixel.
- Well what makes things look sharper? Contrast. Duplicate the original image again and increase the contrast. The amount by which you increase the contrast is the Amount or Intensity slider on most USM dialogs.
Finally put it all together. You have three things at this point:
- A high contrast version of your original image
- The difference of the blurred image and your original (this layer is mostly black); this layer is the unsharp mask
- The original
The algorithm goes like this: Look at a pixel from the unsharp mask and find out its luminosity (brightness). If the luminosity is 100%, use the value from the high-contrast image for this pixel. If it is 0%, use the value from the original image for this pixel. If it's somewhere in-between, mix the two pixels' values using some weighting. Optionally, only change the value of the pixel if it changes by more than a certain amount (this is the Threshold slider on most USM dialogs).
Put it all together and you've got your image!
Here's some pseudocode:
color[][] usm(color[][] original, int radius, int amountPercent, int threshold) {
    // copy original for our return value
    color[][] retval = copy(original);

    // create the blurred copy
    color[][] blurred = gaussianBlur(original, radius);

    // subtract blurred from original, pixel-by-pixel, to make the unsharp mask
    color[][] unsharpMask = difference(original, blurred);

    color[][] highContrast = increaseContrast(original, amountPercent);

    // assuming row-major ordering
    for (int row = 0; row < original.length; row++) {
        for (int col = 0; col < original[row].length; col++) {
            color origColor = original[row][col];
            color contrastColor = highContrast[row][col];
            color difference = contrastColor - origColor;
            float percent = luminanceAsPercent(unsharpMask[row][col]);
            color delta = difference * percent;
            if (abs(delta) > threshold)
                retval[row][col] += delta;
        }
    }
    return retval;
}
Note: I'm no graphics expert, but this is what I was able to learn from the pages I found. Read them yourself and make sure you agree with my findings, but implementing the above should be simple enough, so give it a shot!
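If you want something runnable, here's a rough translation of the pseudocode above in Python with NumPy/SciPy. It assumes an RGB float image in [0, 1]; the contrast stretch around mid-grey and the Rec. 601 luminance weights are my own stand-ins for increaseContrast and luminanceAsPercent, not something taken from those pages:

import numpy as np
from scipy.ndimage import gaussian_filter

def usm(original, radius=2.0, amount_percent=100, threshold=0.05):
    # Blur only the spatial axes of the (height, width, 3) array.
    blurred = gaussian_filter(original, sigma=(radius, radius, 0))

    # The unsharp mask: per-channel difference, mostly black except at edges.
    unsharp_mask = np.abs(original - blurred)

    # "High contrast" copy: push values away from mid-grey by amount_percent.
    factor = 1.0 + amount_percent / 100.0
    high_contrast = np.clip(0.5 + factor * (original - 0.5), 0.0, 1.0)

    # Brightness of the mask (0..1) decides how much of the contrast copy to mix in.
    luminance = unsharp_mask @ np.array([0.299, 0.587, 0.114])
    delta = (high_contrast - original) * luminance[..., None]

    # Only apply changes that exceed the threshold.
    result = np.where(np.abs(delta) > threshold, original + delta, original)
    return np.clip(result, 0.0, 1.0)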
Unsharp masking is usually implemented as a convolution with a kernel that detects edges. The result of this convolution is added back into the original image to increase edge contrast, which adds the illusion of additional "sharpness".
The exact kernel used varies quite a bit from person-to-person and application-to-application. Most of them have this general format:
    -1 -1 -1
g = -1  8 -1
    -1 -1 -1
Some leave the diagonals out, some use a larger center weight and rescale the whole kernel, and some just try different weights. They all have much the same effect in the end; it's a question of experimenting until you find one whose result you like.
Given an input image I, the output is defined as out = I + c(I * g), where * is the 2D convolution operator and c is some scaling constant, usually above 0.5 and less than 1, so you avoid blowing out any more channels than you have to.
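As a rough sketch of that in Python (the kernel above, a c of 0.7, and the 'reflect' border handling are just illustrative choices; it assumes a single-channel float image in [0, 1]):

import numpy as np
from scipy.ndimage import convolve

g = np.array([[-1, -1, -1],
              [-1,  8, -1],
              [-1, -1, -1]], dtype=float)

def kernel_sharpen(image, c=0.7):
    # Convolve with the edge-detecting kernel; the response is large near edges.
    edges = convolve(image, g, mode='reflect')
    # Add a scaled copy of the edge response back to the original.
    return np.clip(image + c * edges, 0.0, 1.0)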