views: 278
answers: 3

Can anyone recommend a website, or give me a brief explanation of how backpropagation is implemented in a neural network? I understand the basic concept, but I'm unsure how to go about writing the code.

Many of the sources I've found simply show the equations without explaining why they're doing it, and the variable names make the code difficult to follow.

Example:

/* Compute the error deltas for the output layer of a sigmoid network.
   Arrays are 1-indexed (slot 0 is unused), as in the original code. */
void bpnn_output_error(double *delta, double *target, double *output,
                       int nj, double *err)
{
  int j;
  double o, t, errsum;

  errsum = 0.0;
  for (j = 1; j <= nj; j++) {
    o = output[j];                       /* actual output of unit j */
    t = target[j];                       /* desired (target) output of unit j */
    delta[j] = o * (1.0 - o) * (t - o);
    errsum += ABS(delta[j]);             /* ABS is a macro, e.g. fabs() */
  }
  *err = errsum;
}

In that example, can someone explain the purpose of

delta[j] = o * (1.0 - o) * (t - o);

Thanks.

+2  A: 

Actually, if you know the theory, the program should be easy to understand. Read a textbook and work through some small examples with pencil and paper to figure out the exact steps of the propagation. This is a general principle for implementing numerical programs: you must understand the details on small cases first.

If you know Matlab, I'd suggest reading some Matlab source code (e.g. here), which is easier to understand than C.

For the code in your question, the names are fairly self-explanatory: output is likely the array of predictions, target the array of training labels, and delta the error between prediction and true value. delta also serves as the quantity used to update the weight vector.

Yin Zhu
+2  A: 

(t-o) is the error in the output of the network, since t is the target output and o is the actual output. The factor o * (1.0 - o) is the derivative of the sigmoid activation function evaluated at the output, so delta[j] is the output error scaled by how sensitive unit j's output is to its net input; this scaled error is exactly what gradient descent needs for the weight update.

This error is accumulated over the training set to judge when training is complete: usually when errsum falls below some target threshold.

Adnan
A: 

Essentially, what backprop does is run the network on the training data, observe the output, then adjust the connection weights, propagating the error from the output layer back toward the input layer.

Paul Nathan