I have a project where I need to subtract an empty template image from an incoming, user-filled image. The document is an ordinary bank cheque.

The aim is to extract the handwritten fields by subtracting the empty template from the filled image.

The issue I am facing is aligning the two images, since they differ in scale, translation, rotation, etc.

Any ideas on how to align the template image with the incoming image?

UPDATE 1:

I am posting an example image from the Wikipedia page, converted to monochrome because my images are monochrome.

+3  A: 

When working on image processing for industrial projects, we have a fiducial in most cases. A fiducial is a mark, such as a hole or a cross, that never changes and is always in the same position.

Generally, two fiducials are enough to correct misalignment such as rotation, translation and also scale. For instance, if you know the distance between the two, you can always check it to verify the scale factor, or correct the scale using the ratio of the measured distance to the expected distance.
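
To make the two-fiducial idea concrete, here is a minimal sketch in Python. It assumes you have already located the same two marks in both images; the function names (`similarity_from_fiducials`, `apply_transform`, `describe`) are made up for illustration and do not come from any library. Treating each point as a complex number, the mapping from template to incoming image is `z' = a*z + b`, where `a` carries scale and rotation and `b` carries translation.

```python
import cmath
import math

def similarity_from_fiducials(p1, p2, q1, q2):
    """Estimate the transform that maps template fiducials p1, p2 onto the
    matching fiducials q1, q2 found in the incoming image.
    Points are (x, y) tuples. Returns complex (a, b) with z' = a*z + b."""
    zp1, zp2 = complex(*p1), complex(*p2)
    zq1, zq2 = complex(*q1), complex(*q2)
    if zp1 == zp2:
        raise ValueError("the two template fiducials must be distinct")
    a = (zq2 - zq1) / (zp2 - zp1)  # |a| is the scale, phase(a) the rotation
    b = zq1 - a * zp1              # translation
    return a, b

def apply_transform(a, b, point):
    """Map a template point into incoming-image coordinates."""
    z = a * complex(*point) + b
    return (z.real, z.imag)

def describe(a, b):
    """Return (scale, rotation in degrees, (tx, ty))."""
    return abs(a), math.degrees(cmath.phase(a)), (b.real, b.imag)
```

For example, `describe(*similarity_from_fiducials((0, 0), (10, 0), (5, 5), (5, 25)))` reports a scale of 2 and a rotation of 90 degrees, since the distance between the marks doubled and the segment joining them turned from horizontal to vertical. To warp the incoming image back onto the template you would apply the inverse mapping, `z = (z' - b) / a`.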

In your case, what I would ask is: do the template and the incoming image share any visual marks that are invariant and can easily be segmented?

If you have the answer to that question, the rest becomes simpler; the difference itself is a fairly straightforward algorithm.
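
As a rough illustration of that last step, the sketch below assumes the two images are already aligned, have the same size, and are stored as lists of rows of 0/1 values with 1 meaning ink. The function name `subtract_template` is hypothetical. It keeps only the ink that appears in the filled image but not in the template.

```python
def subtract_template(filled, template):
    """Return a binary image containing the ink pixels (1) that are present
    in the filled image but absent from the empty template."""
    same_height = len(filled) == len(template)
    same_widths = all(len(f) == len(t) for f, t in zip(filled, template))
    if not (same_height and same_widths):
        raise ValueError("images must have the same dimensions")
    return [[1 if f and not t else 0 for f, t in zip(frow, trow)]
            for frow, trow in zip(filled, template)]
```

In practice the alignment is rarely exact to the pixel, so thin residues of the pre-printed lines tend to survive the subtraction; thickening the template ink slightly before subtracting is one common way to suppress them.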

Andres
If there is no fiducial, Raj can go through corner detection, matching, rotation/scale estimation and image alignment. But it depends on how fast it must be...
Loïc Février
@Andres, the common features are pre-printed text, horizontal lines, etc. Yes, I can use them as markers. Say I want to find the line: I have written a connected-component routine, but it picks up everything connected to the line, and the bounding rectangle becomes very large. In this case, how can I optimise it to segment just the line? Processing time is a constraint in my case.
Raj
It's hard to say. Are you sure that the line is the best 'fiducial' you can use? As @Loïc said, finding the corners would be a better solution. But it depends on a lot of factors, such as your illumination. Without a sample of your image, it is a bit hard to say how it could be done properly.
Andres
@Andres My image is monochrome, so illumination is not an issue. I have updated my question with a sample image.
Raj
+1  A: 

The basic answer is to write a function that takes two images and a 2D transform and tells you how well aligned they are once you apply the transform to the target image. The function needs to be continuous in the transform parameters and have a minimum (0) where the images are perfectly aligned. This is called a cost function.

Then run any optimization algorithm over that function: you are optimizing the transform parameters (translation, scale, rotation). Examples are hill climbing, genetic algorithms, simulated annealing, etc.
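
Here is a minimal sketch of that idea in Python, under some simplifying assumptions that are mine rather than part of any product: instead of comparing raw pixels, the cost compares a handful of landmark points (for example the corners of a printed border) whose correspondence between template and incoming image is already known, and the optimizer is a plain coordinate-wise hill climb with shrinking step sizes. All names (`transform`, `cost`, `hill_climb`) and the starting step sizes are illustrative.

```python
import math

def transform(points, tx, ty, angle, scale):
    """Rotate and scale points about their centroid, then translate."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    moved = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        moved.append((cx + scale * (cos_a * dx - sin_a * dy) + tx,
                      cy + scale * (sin_a * dx + cos_a * dy) + ty))
    return moved

def cost(params, template_pts, target_pts):
    """Sum of squared distances between transformed template landmarks and
    the incoming-image landmarks; 0 when they coincide exactly."""
    moved = transform(template_pts, *params)
    return sum((mx - qx) ** 2 + (my - qy) ** 2
               for (mx, my), (qx, qy) in zip(moved, target_pts))

def hill_climb(template_pts, target_pts, max_sweeps=2000, tol=1e-9):
    """Nudge one parameter at a time, keep any change that lowers the cost,
    and halve that parameter's step when neither direction helps."""
    params = [0.0, 0.0, 0.0, 1.0]   # tx, ty, angle (radians), scale
    steps = [8.0, 8.0, 0.2, 0.2]    # assumed starting step sizes
    best = cost(params, template_pts, target_pts)
    for _ in range(max_sweeps):
        if max(steps) < tol:
            break
        for i in range(len(params)):
            improved = False
            for delta in (steps[i], -steps[i]):
                trial = list(params)
                trial[i] += delta
                trial_cost = cost(trial, template_pts, target_pts)
                if trial_cost < best:
                    params, best, improved = trial, trial_cost, True
                    break
            if not improved:
                steps[i] /= 2.0
    return params, best
```

Rotating and scaling about the centroid is a design choice that keeps the translation parameters largely independent of rotation and scale, which helps a simple one-parameter-at-a-time search. A pixel-based cost (for instance, the count of mismatched pixels after warping) fits the same optimizer, but it is much bumpier, so it usually needs a good starting guess or a more robust search such as simulated annealing.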

There are products that do this -- usually they are called Forms Recognition, Forms Registration, Forms Processing, etc. Some are SDKs, but there are also applications that can do it without programming.

Disclaimer: I work at Atalasoft, where we sell a Forms Processing add-on to our .NET imaging SDK.

Lou Franco
+1 @Lou, thanks for the insight. Can you please point me to some examples or explanations of the cost function, or add some more detail on the transform you are referring to?
Raj
The cost function is whatever you think determines an aligned image. If there is a border, and you can find it -- it could be the size, position and rotation difference of the border. You need to pick something that can be found regardless of scale, position and rotation. I wrote up a CodeProject article that will give you some idea: http://www.codeproject.com/KB/showcase/SimpleOMRDotImage.aspx
Lou Franco