views:

145

answers:

1

There are two parameters while using RBF kernels with Support Vector Machines: C and γ. It is not known beforehand which C and γ are the best for one problem; consequently some kind of model selection (parameter search) must be done. The goal is to identify good (C;γ) so that the classier can accurately predict unknown data (i.e., testing data). weka.classifiers.meta.GridSearch is a meta-classifier for tuning a pair of parameters. It seems, however, that it takes ages to finish (whem the dataset is rather large). What would you suggest to do in order to bring down the time required to accomplish this task?

+2  A: 

Hastie et al.'s SVMPath explores the entire regularization path for C and only requires about the same computational cost of training a single SVM model. From their paper:

Our R function SvmPath computes all 632 steps in the mixture example (n+ = n− = 100, radial kernel, γ = 1) in 1.44(0.02) secs on a pentium 4, 2Ghz linux machine; the svm function (using the optimized code libsvm, from the R library e1071) takes 9.28(0.06) seconds to compute the solution at 10 points along the path. Hence it takes our procedure about 50% more time to compute the entire path, than it costs libsvm to compute a typical single solution.

They released a GPLed implementation of the algorithm in R that you can download from CRAN here.

Using SVMPath should allow you to find a good C value for any given γ quickly. However, you would still need to do separate training runs for different γ values. But, this should be much faster than doing separate runs for each pair of C:γ values.

dmcer