I am working on a project that requires classification of characters and symbols (basically OCR that needs to handle single ASCII characters and symbols such as music notation). I am working with vector graphics (Paths and Glyphs in WPF), so the images can be rendered at any resolution and rotation will be negligible. It will need to classify (and probably learn from) fonts and paths that are not in the training set. Performance is important, though high accuracy takes priority.

I have looked at some examples of image detection using Emgu CV (a .NET wrapper for OpenCV). However, the examples and tutorials I have found seem to deal specifically with image detection and not classification. I don't need to find instances of an image within a larger image, just determine the kind of symbol in an image.

There seems to be a wide range of methods that might work, and I'm not sure where to start. Any advice or useful links would be greatly appreciated.

+1  A: 

Scale all your images down to some standard resolution (appropriately scaled and centered).
Break the canvas down into n square or rectangular blocks.

For each block, you can measure the number of black pixels or the ratio between black and white in that block and treat that as a feature.

Now that you can represent the image as a vector of features (each feature originating from a different block), you could use a lot of standard classification algorithms to predict what class the image belongs to.
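The block-counting step above can be sketched roughly as follows. This is only a minimal illustration, not a tested recipe: it assumes the symbol has already been rendered, scaled, centered and thresholded into a `bool[,]` where `true` means a black pixel, and the names `ZoningFeatures`, `Extract` and `DistanceSquared` are made up for this example.

```csharp
using System;

public static class ZoningFeatures
{
    // Splits a binary image (true = black pixel) into a grid of rows x cols
    // blocks and returns the fraction of black pixels in each block.
    public static double[] Extract(bool[,] image, int rows, int cols)
    {
        int height = image.GetLength(0);
        int width = image.GetLength(1);
        var features = new double[rows * cols];

        for (int r = 0; r < rows; r++)
        {
            // Integer arithmetic spreads any remainder pixels across blocks.
            int y0 = r * height / rows, y1 = (r + 1) * height / rows;
            for (int c = 0; c < cols; c++)
            {
                int x0 = c * width / cols, x1 = (c + 1) * width / cols;
                int ink = 0;
                for (int y = y0; y < y1; y++)
                    for (int x = x0; x < x1; x++)
                        if (image[y, x]) ink++;

                int area = (y1 - y0) * (x1 - x0);
                features[r * cols + c] = area == 0 ? 0.0 : (double)ink / area;
            }
        }
        return features;
    }

    // Squared Euclidean distance between two feature vectors of equal length.
    public static double DistanceSquared(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
```

The simplest classifier on top of this would be nearest neighbour: compute `Extract` for every labelled training symbol, then give a new symbol the label of the training vector with the smallest `DistanceSquared`. Any standard classifier that accepts a fixed-length vector could be used instead.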

Google 'viola jones' for more elaborate methods of this type.

adi92
I tried something quite close to this. It works well for symbols that it has seen before but could be better for new fonts/variations. I'll do some googling.
AndrewS
+1  A: 

You should probably look at the paper "Gradient-Based Learning Applied to Document Recognition" (LeCun et al.), although that refers to handwritten letters and digits. You should also read about Shape Context by Belongie and Malik. The keywords you should be looking for are digit/character/shape recognition (not detection, not classification).

carlosdc
Thanks, I'll take a look and post my results.
AndrewS
I ended up using the EigenObjectRecognizer class in EmguCV. Thanks for the keyword tip.
AndrewS
+1  A: 

If you are using EmguCV, the SURF features example (StopSign detector) would be a good place to start. Another (possibly complementary) approach would be to use the MatchTemplate(..) method.

However examples and tutorials I find seem to deal specifically with image detection and not classification. I don't need to find instances of an image within a larger image, just determine the kind of symbol in an image.

By finding instances of a symbol in an image, you are in effect classifying it. Not sure why you think that is not what you need.

    // Slide the template over the source; each output pixel holds a match score.
    Image<Gray, float> imgMatch = imgSource.MatchTemplate(imgTemplate, Emgu.CV.CvEnum.TM_TYPE.CV_TM_CCOEFF_NORMED);

    double[] min, max;
    Point[] pointMin, pointMax;
    imgMatch.MinMax(out min, out max, out pointMin, out pointMax);

    // max[0] is the score of the best match
    if (max[0] >= (double) myThreshold)
    {
        Rectangle rect = new Rectangle(pointMax[0], new Size(imgTemplate.Width, imgTemplate.Height));
        imgSource.Draw(rect, new Bgr(Color.Aquamarine), 1);
    }

That max[0] gives the score of the best match.
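To turn a match score into a classification, one option is to score the symbol against one template per class and keep the label with the highest score, rejecting it if nothing reaches the threshold. The sketch below is a library-free illustration of that idea and does not use Emgu CV: it assumes every image has already been rendered to the same size as a `float[,]`, and `BestMatchClassifier`, `Correlation` and `Classify` are hypothetical names. For two images of equal size, `Correlation` is the zero-mean normalized correlation, which is the kind of score CV_TM_CCOEFF_NORMED produces.

```csharp
using System;
using System.Collections.Generic;

public static class BestMatchClassifier
{
    // Zero-mean normalized cross-correlation of two equally sized images.
    // The result lies in [-1, 1]; 1 means a perfect match.
    public static double Correlation(float[,] a, float[,] b)
    {
        if (a.GetLength(0) != b.GetLength(0) || a.GetLength(1) != b.GetLength(1))
            throw new ArgumentException("Images must have the same size.");

        int n = a.Length; // total number of pixels
        double meanA = 0, meanB = 0;
        foreach (float v in a) meanA += v;
        foreach (float v in b) meanB += v;
        meanA /= n;
        meanB /= n;

        double cross = 0, varA = 0, varB = 0;
        for (int y = 0; y < a.GetLength(0); y++)
            for (int x = 0; x < a.GetLength(1); x++)
            {
                double da = a[y, x] - meanA, db = b[y, x] - meanB;
                cross += da * db;
                varA += da * da;
                varB += db * db;
            }

        // A flat image has no variance, so treat it as "no correlation".
        double denom = Math.Sqrt(varA * varB);
        return denom == 0 ? 0.0 : cross / denom;
    }

    // Returns the label of the best-scoring template, or null when no
    // template scores at least the threshold.
    public static string Classify(float[,] symbol,
                                  IDictionary<string, float[,]> templates,
                                  double threshold)
    {
        string bestLabel = null;
        double bestScore = threshold;
        foreach (var entry in templates)
        {
            double score = Correlation(symbol, entry.Value);
            if (score >= bestScore)
            {
                bestScore = score;
                bestLabel = entry.Key;
            }
        }
        return bestLabel;
    }
}
```

Comparing scores across templates, rather than checking each one against the threshold in isolation, is what separates a correct match from an incorrect but close one.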

Mikos
This is what I tried first (using the 'SURF feature detector' example). However, I didn't know how to compare the results. It would find a bunch of features for a correct match and a bunch for an incorrect (but close) match. How do you know which set of feature matches is better? On a side note, SURF is rotation-invariant (which is very cool), but probably detrimental for my case.
AndrewS
You do know that you have a match score for each match (SURF or Template matching), which gives you the closeness of the match. You can also set a threshold for the ExhaustiveTemplateMatching class which lets you weed out less relevant ones.
Mikos
I thought that must've been the case but I couldn't find it. Thank you.
AndrewS