views:

345

answers:

5

I understand how a neural net works, but if I want to use it for image processing, specifically character recognition, I don't understand how to feed the image data into the net, since I might have a very large image of the letter A!

Maybe I should extract some information from the image, some characteristics of it, and then use a vector of those values?

That vector would then be the input for the neural net. Has anyone already done something like this? Can you please explain how to approach my problem?

+1  A: 

The name for the problem you're trying to solve is "feature extraction". It's decidedly non-trivial and a subject of active research.

The naive way to go about this is simply to map each pixel of the image to a corresponding input neuron. Obviously, this only works for images that are all the same size, and is generally of limited effectiveness.

Beyond this, there is a host of things you can do... Gabor filters, Haar-like features, PCA and ICA, sparse features, just to name a few popular examples. My advice would be to pick up a textbook on neural networks and pattern recognition or, specifically, optical character recognition.
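
As a small illustration of one of these options, a PCA-based feature extraction sketch with NumPy might look like the following (the array shape and component count are just assumptions):

    # Minimal PCA sketch: project flattened training images onto their top
    # principal components and use the projections as network inputs.
    # "images" is assumed to be an (n_samples, n_pixels) array of flattened greyscale images.
    import numpy as np

    def pca_features(images, n_components=40):
        mean = images.mean(axis=0)
        centered = images - mean
        # SVD of the centered data; the rows of vt are the principal directions
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        components = vt[:n_components]
        features = centered @ components.T      # (n_samples, n_components)
        return features, mean, components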

Martin B
Can you suggest some good books about OCR?
Dzen
Not really my specialty, but a quick search turns up "Feature Extraction Approaches for Optical Character Recognition" by Roman Yampolskiy, which looks as if it might contain what you're after.
Martin B
+4  A: 

The easiest solution would be to normalize all of your images, both for training and testing, to the same resolution. The character in each image should also be about the same size. It is a good idea to use greyscale images as well, so each pixel gives you just one number. Then you can use each pixel value as one input to your network. For instance, if you have images of size 16x16 pixels, your network would have 16*16 = 256 input neurons. The first neuron would see the value of the pixel at (0,0), the second the one at (0,1), and so on. Basically, you put the image values into one vector and feed this vector into the network. This should already work.
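
For example, a minimal sketch of this pixel-to-vector step could look like this (using Pillow and NumPy; the file name is just a placeholder):

    # Flatten one normalized greyscale image into an input vector for the network.
    from PIL import Image
    import numpy as np

    img = Image.open("letter_A.png").convert("L")        # greyscale: one value per pixel
    img = img.resize((16, 16))                           # normalize every image to the same resolution
    pixels = np.asarray(img, dtype=np.float32) / 255.0   # scale pixel values to [0, 1]
    input_vector = pixels.flatten()                      # 16*16 = 256 inputs, row by row: (0,0), (0,1), ...
    # input_vector can now be fed to a network with 256 input neurons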

By first extracting features (e.g., edges) from the image and then running the network on those features, you could perhaps speed up learning and also make the detection more robust. What you do in that case is incorporate prior knowledge: for character recognition you know which features are relevant, so by extracting them as a preprocessing step, the network doesn't have to learn them itself. However, if you provide the wrong, i.e. irrelevant, features, the network will not be able to learn the image -> character mapping.
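
As a rough illustration, an edge-extraction preprocessing step could look something like this (using SciPy's Sobel filter, which is just one of many possible edge detectors):

    # Edge-magnitude features from a 2-D greyscale array; the network would then
    # be fed these values instead of (or in addition to) the raw pixels.
    import numpy as np
    from scipy import ndimage

    def edge_features(pixels):
        dx = ndimage.sobel(pixels, axis=0)   # derivative along the row axis
        dy = ndimage.sobel(pixels, axis=1)   # derivative along the column axis
        magnitude = np.hypot(dx, dy)         # gradient magnitude per pixel
        if magnitude.max() > 0:
            magnitude = magnitude / magnitude.max()  # scale to [0, 1]
        return magnitude.flatten()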

ahans
Are my steps for solving this problem good? 1. Binarize the image. 2. Segmentation: find connected parts of the image, maybe using contours. 3. Process each segment separately from the others: 3.1 extract some information from the image segment, 3.2 compare it with some pattern or feed it into the neural net. I also have some questions: 1. If I segment the image and there is a letter "i", the dot above it will be separated from the segment. How do I handle this situation? Maybe add a special case? 2. Should I resize a segment if it is very large or too small?
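A rough sketch of steps 1 and 2 (binarize, then find connected components) with NumPy/SciPy, with the threshold value and the handling of the dot of an "i" left open:

    # Hypothetical segmentation sketch: threshold, label connected components,
    # and cut out one sub-image per component.
    import numpy as np
    from scipy import ndimage

    def segment_characters(grey, threshold=128):
        binary = grey < threshold               # True where the pixel is dark (ink)
        labels, count = ndimage.label(binary)   # connected-component labelling
        boxes = ndimage.find_objects(labels)    # bounding-box slices, one per component
        # TODO: merge components that are close and vertically aligned,
        # so the dot of an "i" lands in the same segment as its stem.
        return [binary[box] for box in boxes]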
Dzen
Can I input differently sized images to my neural net? I don't think I can, but I am not sure. An image can contain letters of different sizes, so how do I handle that?
Dzen
Your preprocessing steps sound like they could work; however, I would suggest starting with what I suggested in the first paragraph. It appears to me that you don't have much experience with neural networks or character recognition, so in order to get a feeling for what works and how it works, you should start with a simple case. Adding too many steps at once increases the chance of a mistake, and without a real idea of what to expect from each individual step you will have a hard time debugging your code.
ahans
You should not use different sizes, at least not as a first step. In theory, a neural network will be able to recognize everything it has seen during training, given that it is trained with enough data and the network is large enough. However, in practice you will almost always want to normalize your input images first. There are approaches that try to learn invariance to scale, like Yann LeCun's convolutional networks (see, e.g., http://yann.lecun.com/exdb/lenet for character recognition). However, I really suggest you start with the simplest approach, maybe not letters but just digits at first.
ahans
Don't use images that contain several characters. Use images that each contain just one character, have the same size, and where the character in each image is also about the same size. To save yourself all the preprocessing work, get the ZIP code data from http://www-stat.stanford.edu/~tibs/ElemStatLearn/data.html (bottom of the page) and train your network with that. There you don't have to deal with normalization and organizing pixel values into a vector; it has all been done already. If your network is able to do something useful on that data, you can start adding additional steps yourself.
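As an illustration, a minimal training run on that data could look like this (using scikit-learn's MLPClassifier and assuming zip.train has been downloaded and unpacked, each row being a digit label followed by 256 grey values for a 16x16 image):

    # Hedged sketch: load the ZIP code training file and fit a small network on it.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    data = np.loadtxt("zip.train")                  # shape (n_samples, 257): label + 256 pixel values
    y, X = data[:, 0].astype(int), data[:, 1:]

    net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300)
    net.fit(X, y)
    print("training accuracy:", net.score(X, y))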
ahans
Do you mean that the only thing I need to do is build a neural network, train it, and that's all?
Dzen
Yes, as a first step get that data, train your network with it, and test it. If it works well, you can start adding additional steps like using your own images.
ahans
+1  A: 

You can use the actual pixels as input. This is why it is sometimes preferable to use a smaller resolution for the input images.

The nice thing about ANNs is that they are somewhat capable of feature selection: they can ignore unimportant pixels by assigning near-zero weights to those input nodes.
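
As a rough illustration of that point, you could inspect the first-layer weights of a trained network; with real training data, inputs the network has learned to ignore tend to end up with near-zero weights. The sketch below uses scikit-learn and random placeholder data just to be runnable:

    # Placeholder sketch: X and y stand in for real flattened images and labels.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    X = np.random.rand(200, 256)               # stand-in for 200 flattened 16x16 images
    y = np.random.randint(0, 10, size=200)     # stand-in labels
    net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=200).fit(X, y)

    # Mean absolute first-layer weight per input pixel; with real data,
    # uninformative pixels tend toward values close to zero here.
    pixel_importance = np.abs(net.coefs_[0]).mean(axis=1)   # shape: (256,)
    print(pixel_importance.reshape(16, 16).round(2))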

Amro
+1  A: 

Here are two examples on codeproject:

http://www.codeproject.com/KB/cs/neural_network_ocr.aspx

http://www.codeproject.com/KB/dotnet/simple_ocr.aspx

I would read these first and take some ideas...

Ezz
A: 

Here are some steps: make sure your colour/greyscale image is converted to a binary image. To do this, perform some thresholding operation. Following that, do some sort of feature extraction. For the OCR/NN part, this example might help, although it is in Ruby: http://ai4r.rubyforge.org/neuralNetworks.html
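
For instance, a simple fixed-threshold binarization could look like this (Pillow + NumPy; the file name and the threshold of 128 are just assumptions, and something like Otsu's method is usually better):

    # Minimal binarization sketch: everything darker than the threshold becomes "ink".
    from PIL import Image
    import numpy as np

    grey = np.asarray(Image.open("scan.png").convert("L"))  # hypothetical input image
    binary = (grey < 128).astype(np.uint8)                  # 1 where the pixel is dark (ink), 0 elsewhere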

Egon