views:

432

answers:

2

Hi,

I intend to use a multi layer perceptron network trained with backpropagation (one hidden layer, inputs served as 8x8 bit matrices containing the B/W pixels from the image). The following questions arise:

  1. which type of learning should I use: batch or on-line?
  2. how could I estimate the right number of nodes in the hidden layer? I intend to process the 26 letter of english alphabet.
  3. how could I stop the training process, to avoid overfitting?
  4. (not quite related) is there another better NN prved to perform better than MLP? I know about MLP stucking in local minima, overfitting and so on, so is there a better (soft computing-based) approach?

Thanks

+2  A: 

Most of these questions are things that you need to try different options to see what works best. That is the problem with ANNs. There is no "best" way to do almost anything. You need to find out what works for your specific problem. Nevertheless, I will give my advice for your questions.

1) I prefer incremental learning. I think it is important for the network weights to be updated after each pattern.

2) This is a tough question. It really depends on the complexity of your network. How many input nodes, output nodes, and training patterns that there are. For your problem, I might start with 100 and try ranges up and down from 100 to see if there is improvement.

3) I usually calculate the total error of the network when applied to the test set (not the training set) after each epoch. If that error increases for about 5 epochs, I will stop training and then use the network that was created before the increase occurred. It is important not to use the error of the training set when deciding to stop training. This is what will cause overfitting.

4) You could also try a probabilistic neural network if you are representing your output as 26 nodes, each representing a letter of the alphabet. This network architecture is good for classification problems. Again, it may be a good idea just to try a few different architectures to see what works best for your problem.

Kevin Crowell
+1  A: 

Regarding number 3, one way to find out when your ANN starts to overfit is by graphing the accuracy of the net on your training data and your test data vs the number of epochs performed. At some point, as your training accuracy continues to increase (tending towards 100%), your test accuracy will probably start to actually decrease because the ANN is overfitting to the training data. See what epoch that starts to happen and make sure not to train past that.

If your data is very regular and consistent, then it might not overfit until very late in the game, or not at all. And if your data is highly irregular, then your ANN will start to overfit much earlier.

Also, a way to test how regular your data is is to do something like k-fold cross validation.

JoeCool