views: 51
answers: 1
I have a graph (network) which consists of layers, each of which contains nodes (neurons). I would like to write a procedure to duplicate the entire graph in the most elegant way possible -- i.e. with minimal or no overhead added to the structure of the node or layer.

Or, in other words -- the procedure itself may be complex, but the complexity should not "leak" into the structures. They should not become more complex just because they are copyable.

I wrote the code in C#; so far it looks like this:

  • neuron has an additional field -- copy_of, a pointer to the neuron it was copied from; this is my additional overhead
  • neuron has a parameterless method Clone()
  • neuron has a method Reconnect() -- which exchanges a connection from the "source" neuron (parameter) to the "target" neuron (parameter)
  • layer has a parameterless method Clone() -- it simply calls Clone() for all its neurons
  • network has a parameterless method Clone() -- it calls Clone() for every layer, then iterates over all neurons, builds the neuron => copy_of mappings, and calls Reconnect to exchange all the "wiring"

I hope my approach is clear. The question is -- is there a more elegant method? I particularly don't like keeping an extra pointer in the neuron class just for the case of being copied! I would like to gather the data in one place (the network's Clone) and then dispose of it completely (the Clone method cannot take an argument, though).
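To make the shape of this scheme concrete, here is a minimal sketch in Python (the class and method names are illustrative, not the actual C# code; connections are modeled as a simple inputs list):

```python
class Neuron:
    def __init__(self):
        self.inputs = []     # predecessor neurons this one is wired to
        self.copy_of = None  # the overhead field: set on a copy, points to the original

    def clone(self):
        c = Neuron()
        c.inputs = list(self.inputs)  # note: still references the *original* neurons
        c.copy_of = self
        return c

    def reconnect(self, source, target):
        # exchange every connection from `source` for one to `target`
        self.inputs = [target if n is source else n for n in self.inputs]


class Layer:
    def __init__(self, neurons):
        self.neurons = neurons

    def clone(self):
        # simply clones every neuron
        return Layer([n.clone() for n in self.neurons])


class Network:
    def __init__(self, layers):
        self.layers = layers

    def clone(self):
        copied = Network([layer.clone() for layer in self.layers])
        # gather the original -> copy mapping from the copy_of back-pointers
        mapping = {n.copy_of: n for layer in copied.layers for n in layer.neurons}
        # rewire: exchange every connection to an original for its copy
        for layer in copied.layers:
            for neuron in layer.neurons:
                for original in list(neuron.inputs):
                    neuron.reconnect(original, mapping[original])
        return copied
```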

+1  A: 

Use a hash table for copying a general graph:

h = {}  # maps each original node to its copy

def copyAll(node):
    if node in h:
        return h[node]
    copy = node.copy()
    h[node] = copy  # register before recursing, so shared and cyclic nodes terminate
    for successor in node.successors:
        copy.addSuccessor(copyAll(successor))  # recurse with copyAll, not copy(...)
    return copy

Your particular graph seems to be acyclic and to have a special structure, so you don't need a hash table (you can use an array instead), and the approach you are describing seems to be the best way to copy it.

If you are writing a neural network you should just use vectors and matrices of floats to represent the neurons. It may seem less elegant now, but trust me it's much more elegant (and several orders of magnitude faster too).

Consider a neural network with 2 layers: the input (n nodes) and the output (m nodes). Now suppose we have a vector of floats called in that represents the values of the input layer, and we want to compute a vector called out that represents the values of the output layer. The neural network itself consists of an m by n matrix M of floats, where M[j][i] represents how strong the connection between input node i and output node j is. The beauty is that evaluating the network is just a matrix multiplication followed by applying the activation function to every element of the result vector:

out = f(M*in)

where f is the activation function and * is matrix multiplication. That is neural network evaluation in one line! You cannot get it this elegant with an OO design of a neural network.
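As a concrete illustration, here is a NumPy sketch of that one-liner (the sizes, weights and the tanh activation are made up for the example; the matrix is stored as m-by-n so that M @ inp is well-defined):

```python
import numpy as np

n, m = 3, 2                      # n input nodes, m output nodes
M = np.array([[0.5, -1.0, 2.0],  # m-by-n weight matrix:
              [1.0,  0.0, 0.5]]) # M[j][i] = strength of connection input i -> output j
inp = np.array([1.0, 2.0, 3.0])  # values of the input layer ("in" above)

f = np.tanh                      # one possible choice of activation function
out = f(M @ inp)                 # the whole network evaluation in one line
```

Copying such a network is then just M.copy(), which is part of the answer's point.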

Jules
I don't see how it works. Let's say you have nodes A and B connected. You will duplicate A pretty much correctly to A' (now with B' connected to it). But then you have to duplicate B -- there is already a B', but it is empty, and you cannot add connections to it because you lost the mapping that B' is a copy of B. Besides, you cannot add a successor just like that, because it is a node from another layer (in general: from any layer).
macias
You didn't lose the mapping, the mapping is explicitly stored in the hash table h. You can do something similar for your particular graph structure. Instead of having a copy_of field in each node you have a copy_of hash table, so that copy_of[node] returns which node it's a copy of.
Jules
But again, this is not a good way to go about implementing a neural network. I've been there and done that and the array-of-floats approach is much better in every way :)
Jules
Re: array of floats -- I don't have one global f. Each neuron can differ a lot -- do I add a matrix for every attribute? Errors, the activation function and its derivative, and so on? Re: duplicating -- there is still a problem with recreating layers: here the node would have to recreate the layer, so there would have to be another hash table for layers. The parameter may somehow be omitted, but then you have to introduce another global variable -- and that causes problems in concurrent execution. So far, I think it is better to pay the price of an additional pointer per structure.
macias
If you don't have one global f, you'd have a vector of f's instead of one f. Yes, you'd use a matrix or vector for every attribute. I'd have the network first copy all layers, and the layers copy all nodes in the layer (without establishing connections yet). After that you re-establish the connections using the copy_of hash table. If you are worried about performance you should definitely not be copying the network in the first place -- why do you need to copy it?
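A sketch of that two-phase scheme (Python, with illustrative names): the mapping lives only inside the copy routine, so the node class carries no extra field:

```python
class Node:
    def __init__(self):
        self.inputs = []  # references to predecessor nodes

def clone_network(layers):
    """layers is a list of layers, each a list of Node objects."""
    # phase 1: copy every node, without establishing any connections yet
    copy_of = {n: Node() for layer in layers for n in layer}
    # phase 2: re-establish the wiring through the mapping
    for original, copy in copy_of.items():
        copy.inputs = [copy_of[p] for p in original.inputs]
    # rebuild the layer structure; copy_of is discarded when we return
    return [[copy_of[n] for n in layer] for layer in layers]
```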
Jules
I am not worried about the performance of the code but about its elegance. The reason for copying -- I have several instances to measure the performance of the NN (it's part of a genetic algorithm); copying is the initial step of mutation.
macias
Oh, ok then I suppose you can't avoid copying. I still suggest you consider using arrays of floats because copying them is trivial ;)
Jules
And if you want to use objects, then here is a suggestion. I assume your layer structure has an array of the nodes in that layer? Instead of letting the nodes point to their successors with pointers, you can use integer indices into the layer array. That way you avoid the problem completely: you can just copy the indices, because they will be the same in the original and in the copy.
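A sketch of the index-based idea (Python, illustrative names; each node stores positions into the previous layer's array instead of direct references):

```python
class Node:
    def __init__(self, input_indices):
        # indices of predecessor nodes in the previous layer's array
        self.input_indices = list(input_indices)

def clone_layers(layers):
    # the indices are equally valid in the copy, so a plain per-node
    # copy suffices -- no mapping and no rewiring pass is needed
    return [[Node(n.input_indices) for n in layer] for layer in layers]
```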
Jules
Keeping indices is a no-go -- mutation adds and deletes nodes. Pointers stay valid under such changes; indices don't. I still insist on using proper OOP, because I would like to avoid a mess when adding some "attributes" later (which I cannot predict now).
macias
In that case I guess you have to go for the ugly approach, with either a hash table or a copy_of field. Or perhaps you could "delete" nodes by setting all their weights to 0?
Jules
I cannot replace deleting with setting weights to zero, because during learning such "deleted" nodes would reappear :-)
macias
You could perhaps use an extra field "deleted", but this isn't elegant either...
Jules