libsvm

Binarization in Natural Language Processing

Binarization is the act of transforming colorful features of an entity into vectors of numbers, most often binary vectors, to make good examples for classifier algorithms. If we were to binarize the sentence "The cat ate the dog", we could start by assigning every word an ID (for example cat-1, ate-2, the-3, dog-4) and then simply r...
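As a rough illustration of the idea, here is a minimal Python sketch that turns a tokenized sentence into a binary presence vector. It reuses the word IDs from the example above (cat-1, ate-2, the-3, dog-4); the names `WORD_IDS` and `binarize`, and the choice of a presence vector as the encoding, are assumptions made for this sketch and not necessarily what the original post goes on to describe.

```python
# Word IDs taken from the example in the text (1-based).
WORD_IDS = {"cat": 1, "ate": 2, "the": 3, "dog": 4}


def binarize(tokens, word_ids):
    """Return a binary vector with a 1 at position (id - 1) for every known word.

    Unknown words are ignored (an assumption of this sketch).
    """
    vector = [0] * len(word_ids)
    for token in tokens:
        word_id = word_ids.get(token)
        if word_id is not None:
            vector[word_id - 1] = 1
    return vector


if __name__ == "__main__":
    sentence = "The cat ate the dog".lower().split()
    print(binarize(sentence, WORD_IDS))          # every known word is present
    print(binarize(["the", "dog"], WORD_IDS))    # only IDs 3 and 4 are set
```

Each position in the vector answers "does word N occur in the sentence?", which gives every example the same fixed length regardless of sentence length.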

Pointers to some good SVM Tutorial ...

Hi all, I have been trying to grasp the basics of Support Vector Machines, and have downloaded and read many online articles, but I am still not able to grasp it. I would like to know if there is a nice tutorial or some sample code that can be used for understanding, or anything else you can think of, that will enable me to learn SVM B...

Know any good C++ support vector machine (SVM) libraries?

Hey everyone, Do you know of any good C++ SVM libraries out there? I tried libsvm (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) but so far I'm not flabbergasted (no documentation, or close to none). I have also heard of SVMLight and TinySVM. Have you tried them? Any new players? Thanks! JC ...

libsvm model file format

According to this FAQ the model format in libsvm should be straightforward. And in fact it is, when I call just svm-train. As an example, the first SV for the a1a dataset is 1 3:1 11:1 14:1 19:1 39:1 42:1 55:1 64:1 67:1 73:1 75:1 76:1 80:1 83:1 On the other hand, if I use the easy.py script, my first SV ends up being: 512 1:-1 2:-1...

How to do multi class classification using Support Vector Machines (SVM).

Hello, Every book and example shows only binary classification (two classes), where a new vector can belong to either one of the classes. My problem is that I have 4 classes (c1, c2, c3, c4), and I have training data for all 4 classes. For a new vector the output should be like C1 80% (the winner) c2 10% c3 6% c4 4% How to do th...

libSVM automated labeller script

Hi, is there any script that would transform a tab-delimited data file into libSVM data format? For example, my unlabelled data: -1 9.45 1.44 8.90 -1 8.12 7.11 8.90 -1 8.11 6.12 8.78 and I would like to prefix each value with its feature index: -1 1:9.45 2:1.44 3:8.90 -1 1:8.12 2:7.11 3:8.90 -1 1:8.11 2:6.12 3:8.78 I believe this can b...
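One way the conversion described above could be done is sketched below in Python. It assumes the first field of each row is the class label and every remaining field is a feature value, numbered from 1 in column order; the function name `to_libsvm_line` is made up for this sketch.

```python
import sys


def to_libsvm_line(line):
    """Convert 'label v1 v2 ...' into 'label 1:v1 2:v2 ...'.

    Values are kept as the original strings, so '8.90' is not
    reformatted to '8.9'. Fields may be separated by tabs or spaces.
    """
    fields = line.split()
    label, values = fields[0], fields[1:]
    features = [f"{index}:{value}" for index, value in enumerate(values, start=1)]
    return " ".join([label] + features)


if __name__ == "__main__":
    # Read rows from standard input and write libsvm-format rows to standard output.
    for raw_line in sys.stdin:
        if raw_line.strip():
            print(to_libsvm_line(raw_line))
```

Saved under a name of your choice, it can be used as a filter that reads the tab-delimited file on standard input and writes the converted rows to standard output.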

How to call two C programs from within one C program?

How can I call two C applications from within another C application? e.g. : pg1.c can be run as ./a.out pg1_args pg2.c can be run as ./a.out pg2_args I would like to write a program that can be run as: ./a.out pg1_args pg2_args With the result being equivalent to : ./a.out pg1_args ./a.out pg2_args ./a.out pg1_args ./a.out ...

training time and overfitting with gamma and C in libsvm

Hi, I am now using libsvm for a support vector machine classifier with a Gaussian kernel. On its website, it provides a Python script grid.py to select the best C and gamma. I just wonder how training time and overfitting/underfitting change with gamma and C? Is it correct that: suppose C changes from 0 to +infinity, the trained model wi...

What is this Python code trying to do?

Hi, The following Python code traverses a 2D grid of (c, g) in some special order, which is stored in "jobs" and "job_queue". But I am not sure what kind of order it is, even after trying to understand the code. Is someone able to explain the order and the purpose of each function? Thanks and regards! impo...

training for classification using libsvm

Hello all, I want to classify using libsvm. I have 9 training sets, each set has 144000 labelled instances, and each instance has a variable number of features. It is taking about 12 hours to train one set (./svm-train with probability estimates). As I don't have much time, I would like to run more than one set at a time. I'm not su...

How much time does grid.py take to run?

Hello all, I am using libsvm for binary classification. I wanted to try grid.py, as it is said to improve results. I ran this script for five files in separate terminals, and the script has been running for more than 12 hours. This is the state of my 5 terminals now: [root@localhost tools]# python grid.py sarts_nonarts_feat.txt...

Precomputed Kernels with LibSVM in Python

I've been searching the net for ~3 hours but I couldn't find a solution yet. I want to give a precomputed kernel to libsvm and classify a dataset, but: How can I generate a precomputed kernel? (for example, what is the basic precomputed kernel for Iris data?) In the libsvm documentation, it is stated that: For precomputed kernels, the...
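As a hedged sketch of what generating a precomputed kernel can look like: the libsvm README describes the training input for precomputed kernels as one row per instance of the form `<label> 0:i 1:K(xi,x1) ... L:K(xi,xL)`, where `i` is the 1-based serial number of the instance, and training then uses kernel type `-t 4`. The Python below builds such rows with a plain linear kernel (dot product). The helper names `linear_kernel` and `precomputed_rows` and the toy data are assumptions for illustration only.

```python
def linear_kernel(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def precomputed_rows(labels, instances, kernel=linear_kernel):
    """Build libsvm precomputed-kernel training rows.

    Row i has the form: '<label> 0:i 1:K(xi,x1) ... L:K(xi,xL)',
    with i counted from 1.
    """
    rows = []
    for i, (label, xi) in enumerate(zip(labels, instances), start=1):
        entries = [f"0:{i}"]
        for j, xj in enumerate(instances, start=1):
            entries.append(f"{j}:{kernel(xi, xj):g}")
        rows.append(f"{label} " + " ".join(entries))
    return rows


if __name__ == "__main__":
    # Toy data, not the Iris dataset: three 2-dimensional instances.
    toy_instances = [[1, 0], [0, 1], [1, 1]]
    toy_labels = [1, -1, 1]
    for row in precomputed_rows(toy_labels, toy_instances):
        print(row)
```

For a real dataset such as Iris, the same function would be applied to the full list of feature vectors; any other kernel can be swapped in through the `kernel` argument.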

Calculating Nearest Match to Mean/Stddev Pair With LibSVM

I'm new to SVMs, and I'm trying to use the Python interface to libsvm to classify a sample containing a mean and stddev. However, I'm getting nonsensical results. Is this task inappropriate for SVMs or is there an error in my use of libsvm? Below is the simple Python script I'm using to test: #!/usr/bin/env python # Simple classifier t...

Nominal Attributes in LibSVM

When creating a libsvm training file, how do you differentiate between a nominal attribute versus a numeric attribute? I'm trying to encode certain nominal attributes as integers, but I want to ensure libsvm doesn't misinterpret them as numeric values. Unfortunately, libsvm's site seems to have very little documentation. Pentaho's docs s...

SVM Visualization in MATLAB

How do I visualize the SVM classification once I perform SVM training in Matlab? ...

Save PyML.classifiers.multi.OneAgainstRest(SVM()) object?

I'm using PyML to construct a multiclass linear support vector machine (SVM). After training the SVM, I would like to be able to save the classifier, so that on subsequent runs I can use the classifier right away without retraining. Unfortunately, the .save() function is not implemented for that classifier, and attempting to pickle it (b...

How to figure out optimal C / Gamma parameters in libsvm?

I'm using libsvm for multi-class classification of datasets with a large number of features/attributes (around 5,800 per item). I'd like to choose better parameters for C and Gamma than the defaults I am currently using. I've already tried running easy.py, but for the datasets I'm using, the estimated time is near forever (ran ...

How to compute the probability of a multi-class prediction using libsvm?

I'm using libsvm and the documentation leads me to believe that there's a way to output the believed probability of an output classification's accuracy. Is this so? And if so, can anyone provide a clear example of how to do it in code? Currently, I'm using the Java libraries in the following manner SvmModel model = Svm.svm_train(...

Help with representing textual data in a format suitable for SVMs, more specifically libsvm

Hi, My problem at hand is that I need to be able to distinguish agricultural web pages from non-agricultural web pages. This is oriented towards building a focused crawler that only crawls and indexes mostly agricultural pages. I need advice from anyone who is experienced in working with SVMs. Would considering the SVM classifier be appr...

help with LibSVM input data

Hi, I am using the LibSVM tool for my support vector classification implementation. The first line in my input data file looks like this: +1 15752:47 6279:45 475:40 5231:30 515:29 7529:28 11623:24 274:24 15431:21 7342:20 4819:20 7598:18 8853:17 11134:16 501:16 911:15 4656:15 5875:14 10725:13 7334:13 13762:13 8295:12 9314:12 317:12 10641...