Weka is probably the most popular general-purpose machine learning library, but in my experience it can be quite slow.

I have been looking at Shark, Waffles, dlib, PLearn, and MLC++ as alternatives. Of these, Shark and dlib look the most promising.

Does anyone have experience with performance testing these libraries?

+4  A: 

For me, what matters most is "Does this toolkit have the algorithm or feature I want to try out?" Since these toolkits provide a fairly diverse set of features, you should first narrow down what it is you want to do.

So, for example, if you have a burning desire to try out different evolutionary optimization algorithms, then I would go with something like Shark.

On the other hand, I prefer dlib for most of my work, but that doesn't necessarily mean a lot since I wrote it :) However, if you are interested in binary classification, let me suggest my current favorite method for that, the svm_c_ekm_trainer. I frequently use it to train non-linear SVMs on datasets of hundreds of thousands of points. It usually runs in a few minutes (or sometimes even seconds), while the classic SMO algorithm would take hours or days to finish on the same data.
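Timings like these depend heavily on the dataset and parameters, so it is worth measuring on your own data. Here is a minimal sketch of a timing helper that uses only the C++ standard library. The names `time_seconds` and `best_of` are hypothetical, and the callable `work` is a stand-in for whatever training call you want to measure; nothing here is taken from dlib or any of the other libraries.

```cpp
#include <chrono>
#include <cstddef>
#include <utility>

// Runs `work` once and returns the elapsed wall-clock time in seconds.
// `work` is a hypothetical stand-in for one training call on your data.
template <typename Work>
double time_seconds(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Repeats the measurement `runs` times and returns the fastest run,
// which is less sensitive to background noise than a single sample.
// Returns a negative value if `runs` is zero.
template <typename Work>
double best_of(std::size_t runs, Work&& work)
{
    double best = -1.0;
    for (std::size_t i = 0; i < runs; ++i) {
        const double t = time_seconds(work);
        if (best < 0.0 || t < best) best = t;
    }
    return best;
}
```

To compare two trainers, wrap each training call in a lambda, pass both to `best_of` with the same data, and compare the returned times. For runs that take hours, a single `time_seconds` call is probably enough.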

There were also some good answers to a similar question asked not too long ago: "Which machine learning library to use".

Davis King
Thanks Davis! Great job with dlib! I'm really just looking for something with a lot of functionality that I can use most of the time, branching out to other libraries as needed.
griffin