For my graduate research I am creating a neural network that trains to recognize images. I am going much more complex than just taking a grid of RGB values, downsampling, and sending them to the input of the network, like many examples do. I actually use over 100 independently trained neural networks that detect features such as lines, shading patterns, etc., much more like the human eye, and it works really well so far! The problem is I have quite a bit of training data: I show it over 100 examples of what a car looks like, then over 100 examples of what a person looks like, then over 100 of what a dog looks like, and so on. Currently it takes about one week to train the network. This is killing my progress, as I need to adjust and retrain.

I am using Neuroph as the low-level neural network API. I am running a dual quad-core machine (8 cores, 16 hardware threads with hyperthreading), so this should be fast, yet my processor usage is at only 5%. Are there any tricks for Neuroph performance, or Java performance in general? Any suggestions? I am a cognitive psychology doctoral student, and I am decent as a programmer, but I do not know a great deal about performance programming.

+6  A: 

Yeah, I went down that road a few months ago, also for a university project. The first problem is Neuroph: it's deadly slow. Neuroph has well-known major architectural and performance issues; there was an article about exactly that on CodeProject just last week:

http://www.codeproject.com/KB/recipes/benchmark-neuroph-encog.aspx

I followed a similar path to the author of that article. Switching from Neuroph to Encog is a really easy port. The same author also has an article comparing the syntax of Encog, JOONE, and Neuroph, so you can compare them. For more info on Encog:

http://www.heatonresearch.com/encog

Encog will take more advantage of your cores too. Just look at the chart in the above article.

Good luck! Your research sounds really awesome, I would love to see the results.

Miley
Thanks for the quick response. That looks like a great article, and great advice on Neuroph.
+2  A: 

Also look at your training method. Multiple cores help you work FASTER; working smarter is good too. If you are just using plain backpropagation, you are going to take a long time to converge. At a minimum, use something like resilient propagation; I think that might be in Neuroph. Or look at Scaled Conjugate Gradient or Levenberg-Marquardt; Encog does both of these. Encog can also use your GPU through OpenCL to speed things up even further.
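For concreteness, here is a minimal sketch of resilient propagation training in Encog, using Encog 3-style class names and XOR as stand-in data just so it runs end to end; the layer sizes and error target are placeholders, not values from your project:

import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

public class RpropExample {
    public static void main(String[] args) {
        // XOR used as stand-in training data for the sketch.
        double[][] input = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        double[][] ideal = { {0}, {1}, {1}, {0} };

        // Simple feed-forward network: 2 inputs, 3 hidden neurons, 1 output.
        BasicNetwork network = new BasicNetwork();
        network.addLayer(new BasicLayer(null, true, 2));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 3));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
        network.getStructure().finalizeStructure();
        network.reset();

        MLDataSet trainingSet = new BasicMLDataSet(input, ideal);

        // Resilient propagation typically converges in far fewer iterations
        // than plain backprop, and Encog's propagation trainers split the
        // work across your cores.
        ResilientPropagation train = new ResilientPropagation(network, trainingSet);
        do {
            train.iteration();
        } while (train.getError() > 0.01); // placeholder error target
        train.finishTraining();
        System.out.println("Trained to error " + train.getError());
    }
}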

Speeding up iterations is good. But doing MORE with a training iteration is often even better. And doing BOTH is the best of all.

How independent are your neural networks? Honestly, I am the main Encog programmer and I would love to see you switch. BUT, if you are under a time crunch and need to stay with Neuroph, and those nets ARE truly independent, then you might be able to spawn multiple threads and run several Neuroph training loops at once, across all of your cores. That assumes there is nothing in Neuroph that will goof up when several instances of its trainer are running at once; I don't know Neuroph well enough to say how re-entrant it is.
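A rough sketch of that idea using a plain Java thread pool, assuming the Neuroph 2.x TrainingSet class and its blocking learn() call; featureNetworks and trainingSetFor are hypothetical stand-ins for your own collection of nets and per-net data:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.learning.TrainingSet;

// Train each independent network on its own thread, one pool across all cores.
void trainAllInParallel(List<NeuralNetwork> featureNetworks) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    for (final NeuralNetwork net : featureNetworks) {
        final TrainingSet data = trainingSetFor(net); // hypothetical helper: per-network training data
        pool.submit(new Runnable() {
            public void run() {
                net.learn(data); // blocking call; each thread trains one net to completion
            }
        });
    }
    pool.shutdown();
    pool.awaitTermination(7, TimeUnit.DAYS); // wait for every trainer to finish
}

This only pays off if the nets really are independent and share no mutable state inside Neuroph.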

Also I agree, your research sounds really interesting.

JeffHeaton
A: 

Hi,

Are you training from the GUI or from Java code, and which version of Neuroph are you using? If you're using the GUI, take the latest updated version, 2.4u1 (just uploaded it); it has some performance improvements. Also, which training algorithm are you using, and with what settings? You could try DynamicBackpropagation. Your project sounds very interesting and I'm really sorry you're having issues with Neuroph. We were not aware that Neuroph performance was that low compared to others before these benchmarks, and we'll improve that for sure in the future.
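Switching the learning rule from code is a one-liner; a minimal sketch, assuming the Neuroph 2.x class name DynamicBackPropagation in org.neuroph.nnet.learning, with placeholder layer sizes:

import org.neuroph.core.learning.TrainingSet;
import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.DynamicBackPropagation;

// Placeholder sizes; substitute your own layer dimensions and training set.
MultiLayerPerceptron mlp = new MultiLayerPerceptron(inputSize, hiddenLayerSize, outputSize);
mlp.setLearningRule(new DynamicBackPropagation()); // backprop with dynamic learning rate and momentum
mlp.learn(trainingSet);                            // blocking; learnInNewThread(trainingSet) trains in the background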

Like Jeff suggested (thanks Jeff), if your networks are independent you could do something like this:

for (int index = 0; index < numberThreads; index++) {
    MultiLayerPerceptron mlp = new MultiLayerPerceptron(inputSize, hiddenLayerSize, outputSize);
    SupervisedLearning learningRule = (SupervisedLearning) mlp.getLearningRule();
    learningRule.setMaxError(maxError);
    learningRule.setMaxIterations(maxIterations); // make sure we can end
    learningRule.addObserver(this);               // use an observer to tell when individual networks are done, and launch new ones
    this.mlpVector.add(mlp);
    mlp.learnInNewThread(trainingSet);
}

Also, since you have so many networks, the learning parameters may be critical, so you can use the Neuroph trainer to determine the right settings. It's not finished yet, but basically it generates all possible combinations of training settings and tries them one by one (a rough sketch of the idea is below). Hope this will help you; if you have more questions or need help with something, feel free to ask.
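Here is that brute-force idea as a sketch; the parameter grids and the runTrial helper are purely illustrative placeholders, not the actual Neuroph trainer:

// Hypothetical grid search over training settings; runTrial(...) would build
// a network with the given settings, train it, and return the final error.
double[] learningRates = { 0.1, 0.2, 0.5 };
int[] hiddenSizes = { 20, 50, 100 };

double bestError = Double.MAX_VALUE;
double bestRate = 0;
int bestHidden = 0;

for (double rate : learningRates) {
    for (int hidden : hiddenSizes) {
        double error = runTrial(rate, hidden); // hypothetical helper
        if (error < bestError) {
            bestError = error;
            bestRate = rate;
            bestHidden = hidden;
        }
    }
}
System.out.println("Best: rate=" + bestRate + ", hidden=" + bestHidden + ", error=" + bestError);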

Zoran Sevarac