I have asked other AI folk this question, but I haven't really been given an answer that satisfied me.

For anyone else who has programmed an artificial neural network before, how do you test for its correctness?

I guess another way to put it is: how does one debug the code behind a neural network?

A: 

I'm a complete amateur in this field, but don't you use a pre-determined set of data you know is correct?

Bravax
Yes, for training... although "correct" for input data isn't really the proper term. More specifically, you use a set of input data for which you expect to get a certain output, and you train the network to provide that result.
McWafflestix
It's also good to have another pre-determined known-correct set of data to test with. There are things that can go wrong when you use the training data for validation.
David Thornley
+10  A: 

With neural networks, generally what is happening is that you take an untrained network and train it on a given set of data so that it responds in the way you expect. Here's the deal: usually, you're training it up to a certain confidence level for your inputs. Generally (and again, this is just generally; your mileage may vary), you cannot get neural networks to always provide the right answer; rather, you get an estimate of the right answer, to within a confidence range. You know that confidence range from how you trained the network.

The question arises as to why you would want to use neural networks if you cannot be certain that the conclusion they come to is verifiably correct; the answer is that neural networks can arrive at high-confidence answers quickly for certain classes of problems (notably, NP-complete problems), whereas verifiably correct solutions to NP-complete problems are not known to be obtainable in polynomial time. In layman's terms, neural networks can "solve" problems that normal computation can't feasibly solve exactly, but you can only be a certain percentage confident that you have the right answer. You can influence that confidence through the training regimen, and can often push it very high (e.g., 99.9%).

McWafflestix
The question itself wasn't meant to question the theory, or how it works, but rather whether the code behind the neural network actually matches the theory.
supercheetah
I didn't assume you were questioning the theory; was just trying to answer your question as asked.
McWafflestix
+5  A: 

Correctness is a funny concept in most of "soft computing." The best I can tell you is: "a neural network is correct when it consistently satisfies the parameters of its design." You do this by training it with data, then verifying it with other data, with a feedback loop in the middle that lets you know whether the neural network is functioning appropriately.

This is, of course, only the case for neural networks that are large enough that a direct proof of correctness is not possible. It is possible to prove that a neural network is correct through analysis if you are attempting to build one that learns XOR or something similar, but for that class of problem an ANN is seldom necessary.
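
To make that concrete, here is a minimal Python sketch (with hand-chosen weights; the architecture and values are my own illustration, not from any particular source) of what an exhaustive correctness proof looks like for XOR: the input domain has only four points, so you can simply enumerate all of them.

    import numpy as np

    def step(x):
        return (x > 0).astype(float)

    # A hand-built 2-2-1 network with known-correct XOR weights:
    # hidden unit 1 computes OR(a, b), hidden unit 2 computes NAND(a, b),
    # and the output unit computes AND(h1, h2), which equals XOR(a, b).
    W1 = np.array([[1.0, 1.0],     # weights into the OR unit
                   [-1.0, -1.0]])  # weights into the NAND unit
    b1 = np.array([-0.5, 1.5])
    W2 = np.array([1.0, 1.0])
    b2 = -1.5

    def predict(x):
        h = step(W1 @ x + b1)
        return step(W2 @ h + b2)

    # Exhaustive proof of correctness: check every possible input.
    for a in (0, 1):
        for b in (0, 1):
            out = predict(np.array([a, b], dtype=float))
            assert out == (a ^ b), f"network wrong on ({a}, {b})"
    print("network provably computes XOR on its entire input domain")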

earino
True; the issue of proving "correctness" gets VERY complicated very quickly when dealing with non-trivial neural networks, to the point of being intractable for networks of any significant size.
McWafflestix
+1  A: 

I've worked on projects where there is test data as well as training data, so you know the expected outputs for a set of inputs the NN hasn't seen.

One common way of analysing the results of any classifier is the use of an ROC curve; an introduction to the statistics of classifiers and ROC curves can be found at Interpreting Diagnostic Tests.
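
For illustration, here is a minimal sketch of computing an ROC curve and the area under it from a classifier's scores, assuming scikit-learn is available (the labels and scores below are made up, standing in for your network's outputs on held-out test data):

    from sklearn.metrics import roc_curve, roc_auc_score

    y_true = [0, 0, 1, 1, 0, 1, 1, 0]                     # known test labels
    y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3]   # network outputs

    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    auc = roc_auc_score(y_true, y_score)
    print(f"AUC = {auc:.3f}")  # 1.0 is a perfect ranking, 0.5 is chance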

Pete Kirkham
Very good point; you always want to have a set of known, novel data to present to "validate" the network. It's not so much "validating" it as "confirming" it.
McWafflestix
+1  A: 

You're opening up a bigger can of worms here than you might expect.

NNs are perhaps best thought of as universal function approximators, by the way, which may help you in thinking about this stuff.

Anyway, there is nothing special about NNs in terms of your question; the problem applies to any sort of learning algorithm.

The confidence you can have in the results it gives will depend on both the quantity and the quality (often harder to determine) of the training data that you have.

If you're really interested in this stuff, you may want to read up a bit on the problems of overtraining, and ensemble methods (bagging, boosting, etc.).

The real problem is that you usually aren't actually interested in the "correctness" (cf. quality) of an answer for an input you've already seen; rather, you care about predicting the quality of the answer for an input you haven't seen yet. This is a much more difficult problem.

Typical approaches, then, involve "holding back" some of your training data (i.e. the stuff you know the "correct" answer for) and testing your trained system against that. It gets subtle, though, when you start considering that you may not have enough data, or that it may be biased, etc. Many researchers basically spend all of their time thinking about these sorts of issues!
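
As a sketch of the holdout idea (using scikit-learn's small MLP as a stand-in for whatever network you're actually testing, and synthetic data in place of real data):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))            # toy features
    y = (X[:, 0] + X[:, 1] > 0).astype(int)    # toy labels

    # Hold back 20% of the labelled data; the network never sees it in training.
    split = int(0.8 * len(X))
    X_train, y_train = X[:split], y[:split]
    X_test, y_test = X[split:], y[split:]

    net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
    net.fit(X_train, y_train)

    # A large gap between these two scores is the classic symptom of overtraining.
    print("training accuracy:", net.score(X_train, y_train))
    print("held-out accuracy:", net.score(X_test, y_test))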

simon
+1  A: 

I don't believe there is a single correct answer, but there are well-proven probabilistic and statistical methods that can provide reassurance. These statistical methods are usually referred to as resampling.

One method that I can recommend is the Jackknife.
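
Here's a minimal leave-one-out sketch of the jackknife idea applied to prediction, assuming scikit-learn is available (the data and network below are toy stand-ins): train on all points but one, score on the left-out point, and repeat over the whole data set.

    import numpy as np
    from sklearn.model_selection import LeaveOneOut
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 4))       # toy features
    y = (X[:, 0] > 0).astype(int)      # toy labels

    correct = []
    for train_idx, test_idx in LeaveOneOut().split(X):
        net = MLPClassifier(hidden_layer_sizes=(4,), max_iter=1000, random_state=0)
        net.fit(X[train_idx], y[train_idx])
        correct.append(net.predict(X[test_idx])[0] == y[test_idx][0])

    # The spread of these per-sample results also hints at the estimate's variance.
    print("leave-one-out accuracy:", np.mean(correct))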

Thomas Bratt
A: 

My teacher always said his rule of thumb was to train the NN with 80% of your data and validate it with the other 20%. And, of course, make sure that the data set is as comprehensive as you need.
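
In code, that rule of thumb is a one-liner if scikit-learn is available (X and y below are random placeholders for your real data):

    import numpy as np
    from sklearn.model_selection import train_test_split

    X = np.random.rand(100, 5)          # placeholder features
    y = np.random.randint(0, 2, 100)    # placeholder labels

    # Hold back 20% for validation; train on the remaining 80%.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=0)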

Stephen Friederichs
A: 

If you want to find out whether the backpropagation of the network is correct, there is an easy way.

Since you are calculating the derivative of the error landscape, you can check whether your implementation is correct numerically. You calculate the derivative of the error with respect to a specific weight, ∂E/∂w. You can show that

∂E/∂w = (E(w + e) - E(w - e)) / (2 * e) + O(e^2).

(Bishop, Pattern Recognition and Machine Learning, p. 246)

Essentially, you evaluate the error slightly to the left of the weight and slightly to the right, and check whether the numerical gradient matches your analytical gradient.
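
Here's what that check might look like in Python; the loss and analytic gradient below are trivial placeholders you'd swap for your network's error function and your backprop output:

    import numpy as np

    def loss(w):
        # Stand-in for E(w): the network's error as a function of its weights.
        return np.sum(w ** 2)

    def analytic_grad(w):
        # Stand-in for the gradient your backpropagation code produces.
        return 2 * w

    w = np.random.randn(5)
    eps = 1e-6
    num_grad = np.zeros_like(w)
    for i in range(len(w)):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[i] += eps
        w_minus[i] -= eps
        # Central difference: (E(w + e) - E(w - e)) / (2 * e)
        num_grad[i] = (loss(w_plus) - loss(w_minus)) / (2 * eps)

    g = analytic_grad(w)
    # Relative error should be tiny (around 1e-7) if backprop is correct.
    rel_err = np.linalg.norm(num_grad - g) / np.linalg.norm(num_grad + g)
    print("relative error:", rel_err)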

(Here's an implementation: http://github.com/bayerj/arac/raw/9f5b225d6293974f8adfc5f20dfc6439cc1bed35/src/cpp/utilities/utilities.cpp)

bayer
+1  A: 

Here is free educational software with which you can verify the network during learning (i.e., test for learning correctness):

Sharky Neural Network

SharkTime