I know there are libraries that let you use support vector machines from Python code, but I am looking specifically for libraries that let you train one online (that is, without having to give it all the data at once).
Are there any?
LibSVM includes a Python wrapper that works via SWIG.
Example svm-test.py from their distribution:
#!/usr/bin/env python

from svm import *

# a three-class problem
labels = [0, 1, 1, 2]
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
problem = svm_problem(labels, samples);
size = len(samples)

kernels = [LINEAR, POLY, RBF]
kname = ['linear','polynomial','rbf']

param = svm_parameter(C = 10,nr_weight = 2,weight_label = [1,0],weight = [10,1])
for k in kernels:
    param.kernel_type = k;
    model = svm_model(problem,param)
    errors = 0
    for i in range(size):
        prediction = model.predict(samples[i])
        probability = model.predict_probability
        if (labels[i] != prediction):
            errors = errors + 1
    print "##########################################"
    print " kernel %s: error rate = %d / %d" % (kname[param.kernel_type], errors, size)
    print "##########################################"

param = svm_parameter(kernel_type = RBF, C=10)
model = svm_model(problem, param)
print "##########################################"
print " Decision values of predicting %s" % (samples[0])
print "##########################################"

print "Numer of Classes:", model.get_nr_class()
d = model.predict_values(samples[0])
for i in model.get_labels():
    for j in model.get_labels():
        if j>i:
            print "{%d, %d} = %9.5f" % (i, j, d[i,j])

param = svm_parameter(kernel_type = RBF, C=10, probability = 1)
model = svm_model(problem, param)
pred_label, pred_probability = model.predict_probability(samples[1])
print "##########################################"
print " Probability estimate of predicting %s" % (samples[1])
print "##########################################"
print "predicted class: %d" % (pred_label)
for i in model.get_labels():
    print "prob(label=%d) = %f" % (i, pred_probability[i])

print "##########################################"
print " Precomputed kernels"
print "##########################################"
samples = [[1, 0, 0, 0, 0], [2, 0, 1, 0, 1], [3, 0, 0, 1, 1], [4, 0, 1, 1, 2]]
problem = svm_problem(labels, samples);
param = svm_parameter(kernel_type=PRECOMPUTED,C = 10,nr_weight = 2,weight_label = [1,0],weight = [10,1])
model = svm_model(problem, param)
pred_label = model.predict(samples[0])
I haven't heard of one. But do you really need online learning? I've been using SVMs for quite some time and have never encountered a problem where I had to use online learning. Usually I set a threshold on the number of changed training examples (maybe 100 or 1000) and then just batch-retrain on everything.
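The threshold-then-retrain approach could look roughly like the sketch below. It is library-agnostic: `BatchRetrainer` and `train_fn` are hypothetical names, and `train_fn` stands for whatever batch trainer you already use (assumed here to take the full sample and label lists and return a model).

```python
class BatchRetrainer:
    """Buffer new examples and retrain from scratch every `threshold` changes."""

    def __init__(self, train_fn, threshold=100):
        # train_fn is an assumption: callable(samples, labels) -> model
        self.train_fn = train_fn
        self.threshold = threshold
        self.samples = []
        self.labels = []
        self.pending = 0      # examples added since the last retrain
        self.model = None     # stays None until the first retrain

    def add(self, sample, label):
        """Store one example; retrain on all data once enough have accumulated."""
        self.samples.append(sample)
        self.labels.append(label)
        self.pending += 1
        if self.pending >= self.threshold:
            self.retrain()

    def retrain(self):
        """Batch-train on everything seen so far and reset the change counter."""
        self.model = self.train_fn(self.samples, self.labels)
        self.pending = 0
```

Between retrains the old model keeps serving predictions, so the threshold is a trade-off between training cost and how stale the model is allowed to get.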
If your problem is at a scale where you absolutely have to use online learning, then you might want to take a look at Vowpal Wabbit.
Edited after a comment:
Olivier Grisel suggested using a ctypes wrapper around LaSVM. Since I didn't know about LaSVM before and it looks pretty cool, I'm intrigued to try it on my own problems :).
If you're limited to the Python VM only (embedded device, robot), I'd suggest using a voted/averaged perceptron, which performs close to an SVM but is easy to implement and "online" by default.
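To give an idea of how little code that takes, here is a minimal pure-Python sketch of the averaged variant for binary problems. The class and method names are my own, labels are assumed to be +1 or -1, and features are assumed to be plain lists of numbers.

```python
class AveragedPerceptron:
    """Binary averaged perceptron; labels must be +1 or -1."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features       # current weights
        self.b = 0.0                      # current bias
        self.w_sum = [0.0] * n_features   # sum of the weights over all steps
        self.b_sum = 0.0                  # sum of the bias over all steps

    @staticmethod
    def _score(w, b, x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    def partial_fit(self, x, y):
        """Online update on a single example (x, y)."""
        if y * self._score(self.w, self.b, x) <= 0:
            # Mistake: move the hyperplane toward the correct side.
            self.w = [wi + y * xi for wi, xi in zip(self.w, x)]
            self.b += y
        # Accumulate after every example, mistaken or not.
        self.w_sum = [s + wi for s, wi in zip(self.w_sum, self.w)]
        self.b_sum += self.b

    def predict(self, x):
        """Classify with the summed weights (same sign as the averaged ones)."""
        return 1 if self._score(self.w_sum, self.b_sum, x) > 0 else -1


# Usage: feed examples one at a time, in whatever order they arrive.
clf = AveragedPerceptron(n_features=2)
for x, y in [([2.0, 0.0], 1), ([0.0, 2.0], -1), ([3.0, 1.0], 1), ([1.0, 3.0], -1)]:
    clf.partial_fit(x, y)
```

Each update touches only the current example, so memory use stays constant no matter how much data streams through.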
Just saw that Elefant has some online-SVM code.
Why would you want to train it online? Adding training instances would usually require re-solving the quadratic programming problem associated with the SVM.
One way to handle this is to train an SVM in batch mode and, when new data becomes available, check whether these data points fall within the [-1, +1] margin of the hyperplane. If so, retrain the SVM using all the old support vectors plus the new training data that falls in the margin.
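A rough sketch of that selection step, independent of any particular SVM library: `decision_function` is assumed to be a callable returning the signed decision value of the current model for one point, and `support_vectors` is assumed to be a list of `(x, y)` pairs extracted from it. Both names are placeholders for whatever your library exposes.

```python
def in_margin(decision_value):
    """True if a decision value lies inside the [-1, +1] margin band."""
    return -1.0 <= decision_value <= 1.0


def build_retraining_set(support_vectors, new_points, decision_function):
    """Return (training_set, needs_retrain) for the incremental scheme.

    support_vectors: list of (x, y) pairs from the current model
    new_points:      list of (x, y) pairs that have just arrived
    decision_function: callable(x) -> float, an assumption about the model
    """
    inside = [(x, y) for x, y in new_points if in_margin(decision_function(x))]
    if not inside:
        # Nothing fell in the margin: keep the current model as it is.
        return list(support_vectors), False
    # Old support vectors plus only the new points that fall in the margin.
    return list(support_vectors) + inside, True
```

If `needs_retrain` is true, pass the returned set to your usual batch trainer; every new point outside the margin is dropped.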
Of course, the results can differ slightly from batch training on all your data, since some of the discarded points might have become support vectors later on. So again, why do you want to perform online training of your SVM?