ansaurus

Question

Help me with my backprop implementation in Python

Answer 1

+3 A:

Looks like what you get is nearly the initial state of Neuron (nearly self.idealValue). Maybe you should not initialize this Neuron before having actual data to provide ?

EDIT: OK, I looked a bit deeper in the code and simplified it a bit (will post simplified version below). Basically your code has two minor errors (looks like things you just overlooked), but that leads to a network that definitely won't work.

you forgot to set value of expectedOutput in output layer while in learning phase. Without that the network definitely can't learn anything and will always be stuck with initial idealValue. (That is the bahavior that I spotted at first reading). This one could even have been spotted in your description of the training steps (and probably would have if you hadn't posted the code, this is one of the rare case I know where actually posting the code was hiding the error instead of making it obvious). You fixed this one after your EDIT1.
when activating network in calculateSingleOutputs, you forgot to activate the hidden layer.

Obviously any of these two problems will lead to a disfonctional network.

Once corrected, it works (well, it does in my simplified version of your code).

The errors were not easy to spot because the initial code was much too complicated. You should think twice before introducing new classes or new methods. Not creating enough methods or classes will make code hard to read and to maintain, but creating too many may make it even harder to read and maintain. You have to find the right balance. My personal method to find this balance is to follow code smells and refactoring techniques wherever they lead me. Sometimes adding methods or creating classes, sometimes removing them. It's certainly not perfect but that's what works for me.

Below is my version of code after some refactoring applied. I spent about one hour changing your code but always keeping it functionaly equivalent. I took that as a good refactoring exercise, as the initial code was really horrible to read. After refactoring it just took 5 minutes to spot the problems.

import os
import math

"""
A simple backprop neural network. It has 3 layers:
    Input layer: 2 neurons
    Hidden layer: 2 neurons
    Output layer: 1 neuron
"""

class Weight:
    """
    Class representing a weight between two neurons
    """
    def __init__(self, value, from_neuron, to_neuron):
        self.value = value
        self.from_neuron = from_neuron
        from_neuron.outputWeights.append(self)
        self.to_neuron = to_neuron
        to_neuron.inputWeights.append(self)

        # delta value, this will accumulate and after each training cycle
        # will be used to adjust the weight value
        self.delta = 0.0

class Neuron:
    """
    Class representing a neuron.
    """
    def __init__(self):
        self.value = 0.0        # the output
        self.idealValue = 0.0   # the ideal output
        self.error = 0.0        # error between output and ideal output
        self.inputWeights = []    # weights that end in the neuron
        self.outputWeights = []  # weights that starts in the neuron

    def activate(self):
        """
        Calculate an activation function of a neuron which is 
        a sum of all input weights * neurons where those weights start
        """
        x = 0.0;
        for weight in self.inputWeights:
            x += weight.value * weight.from_neuron.value
        # sigmoid function
        self.value = 1 / (1 + math.exp(-x))

class Network:
    """
    Class representing a whole neural network. Contains layers.
    """
    def __init__(self, layers, learningRate, weights):
        self.layers = layers
        self.learningRate = learningRate    # the rate at which the network learns
        self.weights = weights

    def training(self, entries, expectedOutput):
        for i in range(len(entries)):
            self.layers[0][i].value = entries[i]
        for i in range(len(expectedOutput)):
            self.layers[2][i].idealValue = expectedOutput[i]
        for layer in self.layers[1:]:
            for n in layer:
                n.activate()
        for n in self.layers[2]:
            error = (n.idealValue - n.value) * n.value * (1 - n.value)
            n.error = error
        for n in self.layers[1]:
            error = 0.0
            for w in n.outputWeights:
                error += w.to_neuron.error * w.value
            n.error = error
        for w in self.weights:
            w.delta += w.from_neuron.value * w.to_neuron.error

    def updateWeights(self):
        for w in self.weights:
            w.value += self.learningRate * w.delta

    def calculateSingleOutput(self, entries):
        """
        Calculate a single output for input values.
        This will be used to debug the already learned network after training.
        """
        for i in range(len(entries)):
            self.layers[0][i].value = entries[i]
        # activation function for output layer
        for layer in self.layers[1:]:
            for n in layer:
                n.activate()
        print self.layers[2][0].value


#------------------------------ initialize objects etc

neurons = [Neuron() for n in range(5)]

w1 = Weight(-0.79, neurons[0], neurons[2])
w2 = Weight( 0.51, neurons[0], neurons[3])
w3 = Weight( 0.27, neurons[1], neurons[2])
w4 = Weight(-0.48, neurons[1], neurons[3])
w5 = Weight(-0.33, neurons[2], neurons[4])
w6 = Weight( 0.09, neurons[3], neurons[4])

weights = [w1, w2, w3, w4, w5, w6]
inputLayer  = [neurons[0], neurons[1]]
hiddenLayer = [neurons[2], neurons[3]]
outputLayer = [neurons[4]]
learningRate = 0.3
network = Network([inputLayer, hiddenLayer, outputLayer], learningRate, weights)

# just for debugging, the real training set is much larger
trainingSet = [([0.0,0.0],[0.0]),
               ([1.0,0.0],[1.0]),
               ([2.0,0.0],[1.0]),
               ([0.0,1.0],[0.0]),
               ([1.0,1.0],[1.0]),
               ([2.0,1.0],[0.0]),
               ([0.0,2.0],[0.0]),
               ([1.0,2.0],[0.0]),
               ([2.0,2.0],[1.0])]

#------------------------------ let's train
for i in range(100): # training iterations
    for entries, expectedOutput in trainingSet:
        network.training(entries, expectedOutput)
    network.updateWeights()

#network has learned, let's check
network.calculateSingleOutput((1, 0)) # this should be close to 1
network.calculateSingleOutput((0, 0)) # this should be close to 0

By the way there is still a third problem I didn't corrected (but easy to correct). If x is too big or too small (> 320 or < -320) math.exp() will raise an exception. This will occur if you apply for training iterations, say a few thousands. The most simple way to correct that I see is to check for value of x and if it's too big or too small set Neuron's value to 0 or 1 depending on the case, which is the limit value.

kriss 2010-10-21 14:53:54

Well, I will try that tomorrow.

Richard Knop 2010-10-21 20:28:17

Thanks very much. Yeah, I guess I overcomplicated it. I just wanted to avoid procedural programming and do everything in OOP and I got carried away.

Richard Knop 2010-10-22 07:40:57

By the way, try print network.calculateSingleOutput(2.0, 1.0). it will print incorrect output :)

Richard Knop 2010-10-23 22:27:16

@Richard Knop: do you mean `network.calculateSingleOutput([2.0, 1.0])` ? (the entries parameter expect only one input that is a list of two numbers, your version will give a syntax error). With 100 learning iterations it yield: 0.04, not exactly zero as expected, but neither something I would call incorrect, it's still close to zero.

kriss 2010-10-24 02:35:06

@Richard Knop: OK, I got it. It works with my version, not with yours. I guess it's something that changed with EDIT1, as I refactored from the initial version. The reason of the problem is (again) not obvious to me, you will have to check the difference by yourself.

kriss 2010-10-24 02:41:37

Thanks. I will go line by line and try to find the difference.

Richard Knop 2010-10-24 10:46:23

But still I think your version is not working correctly. This should return 1 network.calculateSingleOutput((1, 1)) as should this network.calculateSingleOutput((2, 2)).

Richard Knop 2010-10-24 10:50:24

But anyways, I got mine code working, I am doing some refactoring now. I will post the final version later.

Richard Knop 2010-10-24 12:43:20

@Richard Knop: yes, there is something wrong in my version. I will also check and post corrected version.

kriss 2010-10-24 13:06:45

Damn. I've found out my code does not work correctly either :) I will have to investigate where is an error in my code. It does the same incorrect behavior as yours.

Richard Knop 2010-10-31 14:07:42

ansaurus

tags:

views:

answers:

Help me with my backprop implementation in Python

related questions