I have a few questions about how to code the backpropagation algorithm for neural networks:
The topology of my network is an input layer, a hidden layer, and an output layer. Both the hidden layer and the output layer use sigmoid activation functions.
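For concreteness, here is a minimal sketch of the forward pass I have in mind (NumPy; the layer sizes and the names `n_in`, `n_hidden`, `n_out` are placeholders I made up, and there is no bias yet, which is part of my first question below):

```python
import numpy as np

def sigmoid(e):
    return 1.0 / (1.0 + np.exp(-e))

# Placeholder layer sizes, just for the sketch.
n_in, n_hidden, n_out = 3, 4, 2

rng = np.random.default_rng(0)
W1 = rng.standard_normal((n_hidden, n_in))   # input  -> hidden weights
W2 = rng.standard_normal((n_out, n_hidden))  # hidden -> output weights

def forward(x):
    h = sigmoid(W1 @ x)   # hidden layer: sigmoid of its weighted input
    y = sigmoid(W2 @ h)   # output layer: sigmoid as well
    return h, y

h, y = forward(np.array([0.0, 1.0, 0.5]))
```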
- First of all, should I use a bias? Where should I connect the bias in my network? Should I put one bias unit per layer, feeding both the hidden layer and the output layer? What about the input layer? (The sketch after this list shows where I have tentatively wired it.)
- In this link, they define the delta of the last layer as the desired output minus the actual output, and they backpropagate the deltas as can be seen in the figure. They keep a table holding all the deltas before actually applying the weight updates in a feedforward fashion (again, see the sketch after this list). Is this a departure from the standard backpropagation algorithm?
- Should I decrease the learning rate over time? (The sketch below includes a placeholder decay schedule.)
- In case anyone knows, is Resilient Propagation (RPROP) an online or a batch learning technique?
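To make these questions concrete, here is a rough sketch of what I am picturing: one bias unit feeding the hidden layer and one feeding the output layer (none into the input layer), a delta table filled in backwards before any weight is changed, and a learning rate that decays each epoch. The decay schedule and all of the numbers are placeholders of my own, not something taken from the tutorial:

```python
import numpy as np

def sigmoid(e):
    return 1.0 / (1.0 + np.exp(-e))

n_in, n_hidden, n_out = 3, 4, 2
rng = np.random.default_rng(0)
# One extra column per weight matrix holds the bias weight; each layer's
# input vector is padded with a constant 1 so the bias acts like a unit
# that always fires. No bias feeds the input layer itself.
W1 = rng.standard_normal((n_hidden, n_in + 1))
W2 = rng.standard_normal((n_out, n_hidden + 1))

x = np.array([0.0, 1.0, 0.5])    # dummy training example
target = np.array([1.0, 0.0])    # dummy desired output

lr0 = 0.5
for epoch in range(100):
    lr = lr0 / (1.0 + epoch / 50.0)  # made-up decay schedule

    # forward pass
    x_b = np.append(x, 1.0)
    h = sigmoid(W1 @ x_b)
    h_b = np.append(h, 1.0)
    y = sigmoid(W2 @ h_b)

    # delta "table": computed backwards and stored, nothing updated yet
    delta_out = (target - y) * y * (1.0 - y)
    delta_hid = (W2[:, :-1].T @ delta_out) * h * (1.0 - h)

    # only now are the weights updated, layer by layer
    W2 += lr * np.outer(delta_out, h_b)
    W1 += lr * np.outer(delta_hid, x_b)
```

I folded the bias into the weight matrices as an extra always-on input to each non-input layer, since that seemed to be what the figures implied; backprop then treats it like any other weight.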
Thanks
edit: One more thing. In the following picture, assuming I'm using the sigmoid function, df1(e)/de is f1(e) * [1 - f1(e)], right?
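A tiny numerical check I ran to convince myself (plain Python; the values of `e` are picked arbitrarily), comparing f1(e) * [1 - f1(e)] against a central finite difference:

```python
import math

def f1(e):
    return 1.0 / (1.0 + math.exp(-e))

for e in (-2.0, 0.0, 1.5):
    analytic = f1(e) * (1.0 - f1(e))
    eps = 1e-6
    numeric = (f1(e + eps) - f1(e - eps)) / (2 * eps)
    print(e, analytic, numeric)  # the last two columns should agree
```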