views:

456

answers:

3

I have been trying to get a simple double XOR neural network to work, and I am having problems getting backpropagation to train a really simple feed-forward neural network.
I have mostly been trying to follow this guide in getting a neural network working, but at best I have made programs that learn at an extremely slow rate.

As I understand neural networks:

  1. Values are computed by taking the result of a sigmoid function applied to the sum of all inputs to that neuron. This is then fed to the next layer using the weight for each neuron (a rough sketch of this is after the list below)
  2. At the end of a run the error is computed for the output neurons; then, using the weights, the error is back-propagated by simply multiplying the values and summing at each neuron
  3. When all of the errors are computed, the weights are adjusted by delta = weight of the connection * derivative of the sigmoid (of the value of the neuron the weight goes to) * value of the neuron that the connection is to * error of the neuron * amount of output error of the neuron it goes to * beta (some constant for the learning rate)
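In code, my understanding of the forward pass (step 1) would look roughly like the sketch below. This is not my actual Net.cpp, just how I think it is supposed to work; the names are made up.

#include <cmath>
#include <cstddef>
#include <vector>

// Squash a weighted sum into (0, 1).
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// weights[j][i] is the weight from input neuron i to neuron j in the next layer.
std::vector<double> feedForward(const std::vector<double>& inputs,
                                const std::vector<std::vector<double>>& weights)
{
    std::vector<double> outputs(weights.size(), 0.0);
    for (std::size_t j = 0; j < weights.size(); ++j) {
        double sum = 0.0;
        for (std::size_t i = 0; i < inputs.size(); ++i)
            sum += inputs[i] * weights[j][i];   // sum of all weighted inputs
        outputs[j] = sigmoid(sum);              // sigmoid of the sum becomes the value
    }
    return outputs;
}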

This is my current muck of code that I am trying to get working. I have a lot of other attempts somewhat mixed in, but the main backpropagation function I am trying to fix is on line 293 in Net.cpp

+2  A: 

Have a look at 15 Steps to implement a Neural Network; it should get you started.

Gregory Pakosz
A: 

Sounds to me like you are struggling with backprop. What you describe above doesn't quite match how I understand it to work, and your description is a bit ambiguous.

You calculate the output error term to backpropagate as the difference between the prediction and the actual value, multiplied by the derivative of the transfer function. It is that error value which you then propagate backwards. The derivative of a sigmoid is calculated quite simply as y(1-y), where y is your output value. There are lots of proofs of that available on the web.
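In code, that output error term might look something like this minimal sketch (the names are mine, not anything from your Net.cpp):

// y is the neuron's output *after* the sigmoid, so the derivative is simply y * (1 - y).
double sigmoidDerivative(double y) { return y * (1.0 - y); }

// Error term for one output neuron: (actual - prediction) * derivative at the output.
double outputError(double target, double prediction)
{
    return (target - prediction) * sigmoidDerivative(prediction);
}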

For a node on the inner layer, you multiply that output error by the weight between the two nodes, and sum all of those products as the total error from the outer layer being propagated to the node in the inner layer. The error associated with the inner node is then that total error multiplied by the derivative of the transfer function applied to the node's original output value. Here's some pseudocode:

total_error = sum(output_errors * weights)
node_error = sigmoid_derivative(node_output) * total_error

This error is then propagated backwards in the same manner right back through the input layer weights.
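As a sketch in C++, the error for one inner (hidden) node might look like the following; outgoingWeights[k] is assumed to be the weight from this node to output node k, and the names are placeholders rather than anything from your code:

#include <cstddef>
#include <vector>

double sigmoidDerivative(double y) { return y * (1.0 - y); }   // same y(1-y) as above

// Error term for one hidden node: sum the weighted errors coming back from the
// outer layer, then multiply by the derivative at this node's own output.
double hiddenError(const std::vector<double>& outputErrors,
                   const std::vector<double>& outgoingWeights,
                   double nodeOutput)
{
    double totalError = 0.0;
    for (std::size_t k = 0; k < outputErrors.size(); ++k)
        totalError += outputErrors[k] * outgoingWeights[k];
    return sigmoidDerivative(nodeOutput) * totalError;
}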

The weights are adjusted using these error terms and the output values of the nodes:

weight_change = outer_error * inner_output_value

The learning rate is important because the weight change is calculated for every pattern/row/observation in the input data. You want to moderate the weight change for each row so that the weights don't get unduly changed by any single row, and so that all rows have an effect on the weights. The learning rate gives you that; you adjust the weight change by multiplying by it:

weight_change = outer_error * inner_output_value * learning_rate

It is also normal to remember these changes between epochs (iterations) and to add a fraction of the previous change to the current one. The fraction added is called momentum; it is supposed to speed you up through regions of the error surface where there is not much change and slow you down where there is detail.

weight_change = (outer_error * inner_output_value * learning_rate) + (last_change * momentum)

There are algorithms for adjusting the learning rate and momentum as the training proceeds.

The weight is then updated by adding the change:

new_weight = old_weight + weight_change
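Putting the whole weight update together, a sketch in C++ might look like this (the names are made up, and lastChange would have to be stored per weight between iterations):

// Update a single weight given the error term of the downstream (outer) node and
// the output of the upstream (inner) node that feeds it.
double updateWeight(double oldWeight,
                    double outerError,     // error term of the node the weight points to
                    double innerOutput,    // output value of the node the weight comes from
                    double learningRate,
                    double momentum,
                    double& lastChange)    // weight change applied last time, for momentum
{
    double weightChange = (outerError * innerOutput * learningRate)
                        + (lastChange * momentum);
    lastChange = weightChange;             // remember it for the next momentum term
    return oldWeight + weightChange;
}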

I had a look through your code, but rather than correct it and post that, I thought it was better to describe backprop for you so you can code it up yourself. If you understand it, you'll be able to tune it for your circumstances too.

HTH and good luck.

Simon
I tried to fix my code with your suggestions (I did not get the momentum in yet), but I am still having problems with the backprop system. To me, at least, it looks like you were telling me to do the same thing but just bundle more numbers into the error value. I feel as if I am missing something small but important, and that is causing my backprop not to function.
Matthew
I've tried a couple of times to figure out what your code is doing, but have given up. I think the total_error value you calculate is probably wrong because you do it for a layer and call DSigmoid twice. I suggest you do a single iteration with 2 rows of input data on paper or in Excel so you know how the whole thing works. Then get your network to spit out its weights so you can compare it to your calculations. By that stage you should understand a) what is supposed to happen and b) what's wrong with your code.
Simon
A: 

Check out the book titled:

AI Application Programming

It's a nice one to start with and has code for many AI techniques.

Ashish
That's a bit generic - you might as well post a link to a dictionary...
jon hanson