views:

90

answers:

1

With StompChicken's corrections (I miscomputed one dot product, ugh!) the answer appears to be yes. I have since tested the same problem using a precomputed kernel with the same correct results. If you are using libsvm StompChickens clear, organized computations are a very nice check.

Original Question: I am about to start using precomputed kernels in libSVM. I had noticed Vlad's answer to a question and I thought it would be wise to confirm that libsvm gave correct answers. I started with non-precomputed kernels, just a simple linear kernel with 2 classes and three data points in 3 dimensional space. I used the data

1 1:3 2:1 3:0
2 1:3 2:3 3:1
1 1:7 3:9

The model file generated by a call to svm-train -s 0 - t 0 contains

svm_type c_svc
kernel_type linear
nr_class 2
total_sv 3
rho -1.53951
label 1 2
nr_sv 2 1
SV
0.4126650675419768 1:3 2:1 3:0 
0.03174528241667363 1:7 3:9 
-0.4444103499586504 1:3 2:3 3:1 

However when I compute the solution by hand that is not what I get. Does anyone know whether libsvm suffers from errors or can anyone compare notes and see whether they get the same thing libsvm does?

The coefficients a1, a2, a3 returned by libsvm are should be the values that make

a1 + a2 + a3 - 5*a1*a1 + 12*a1*a2 - 21*a1*a3 - 19*a2*a2/2 + 21*a2*a3 - 65*a3*a3 

as large as possible with the restrictions that a1 + a3 = a2 and each of a1, a2, a3 is required to lie between 0 and 1 (the default value of C).

The above model file says the answer is

a1 = .412665...
a2 = .444410...
a3 = .031745...

But one just has to substitute a2 = a1 + a3 into the big formula above and confirm both partial derivatives are zero to see if this solution is correct (since none of a1,a2,a3 is 0 or 1) but they are not zero.

Am I doing something wrong, or is libsvm giving bad results? (I am hoping I am doing something wrong.)

+6  A: 

LibSVM is a very widely used library and I highly doubt anything is drastically wrong with the code. That said, I think it's great that there are people who are paranoid enough to actually check it for correctness - well done!

The solution seems correct according to the working that I give below. What I mean by that is it satisfies the KKT conditions (15.29). It also true that the partial derivatives of the dual vanish at the solution.

Here's my working...

x1 = (3,1,0)  x2 = (3,3,1)  x3 = (7,0,9)
y1 = -1       y2 = 1        y3 = -1

K = [10   12   21]
    [12   19   30]
    [21   30  130]

L_dual = a1 + a2 + a3 -5a1^2 + 12a1a2 - 21a1a3 - (19/2)a2^2 + 30a2a3 - 65a3^2)

a1 = 0.412  a2 = 0.4444  a3 = 0.0317

Checking KKT:
y1.f(x1) = y1 * (y1*a1*K(x1,x1) + y2*a2*K(x1,x2) + y3*a3*k(x1,x3) - rho)
         = rho + 10*a1 + 21*a3 - 12*a2
         ~= 1
(Similar for the x2 and x3)

Substituting a2 = a1 + a3 into L_dual:
L_dual = 2a1 + 2a3 - 2.5a1^2 + 2a1a3 - 44.5a3^2
dL/da1 = 2 - 5a1 + 2a3 = 0
dL/da3 = 2 + 2a1 - 89a3 = 0
StompChicken