When creating a libsvm training file, how do you differentiate between a nominal attribute and a numeric attribute? I'm trying to encode certain nominal attributes as integers, but I want to ensure libsvm doesn't misinterpret them as numeric values. Unfortunately, libsvm's site seems to have very little documentation. Pentaho's docs seem to imply libsvm makes this distinction, but I'm still not clear how it's made.

+4  A: 

Don't do this: "I'm trying to encode certain nominal attributes as integers."

Rather, use a separate binary feature for each value of each nominal attribute.

The way SVMs are formulated, all attributes/features are numeric and class labels are nominal. Nominal attributes are essentially faked by using mutually exclusive binary features.
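As a rough sketch of that encoding, the Python below expands one nominal attribute into mutually exclusive binary features and writes lines in libsvm's sparse `label index:value` format. The attribute names (`color`, `size`), the sample rows, and both helper functions are made up for illustration; they are not part of libsvm.

```python
def build_feature_index(rows, nominal_keys, numeric_keys):
    """Assign a 1-based feature index to each numeric attribute and to
    each (nominal attribute, value) pair seen in the data."""
    index = {}
    for key in numeric_keys:
        index[key] = len(index) + 1
    for key in nominal_keys:
        # Sort the observed values so the mapping is deterministic.
        for value in sorted({row[key] for row in rows}):
            index[(key, value)] = len(index) + 1
    return index


def to_libsvm_line(label, row, index, nominal_keys, numeric_keys):
    """Format one example as 'label idx:val ...' with ascending indices."""
    features = {}
    for key in numeric_keys:
        if row[key] != 0:  # the sparse format omits zero-valued features
            features[index[key]] = row[key]
    for key in nominal_keys:
        # Exactly one binary feature is switched on per nominal attribute.
        features[index[(key, row[key])]] = 1
    parts = [f"{i}:{v}" for i, v in sorted(features.items())]
    return " ".join([str(label)] + parts)


# Hypothetical data: one nominal attribute and one numeric attribute.
rows = [
    {"color": "red", "size": 2.5},
    {"color": "green", "size": 0.0},
    {"color": "blue", "size": 1.0},
]
labels = [1, -1, 1]

index = build_feature_index(rows, ["color"], ["size"])
lines = [
    to_libsvm_line(y, r, index, ["color"], ["size"])
    for y, r in zip(labels, rows)
]
print("\n".join(lines))
```

The same index mapping has to be reused when encoding test data. In this sketch, a nominal value that never appeared in the training rows raises a `KeyError`, so unseen values need a policy of their own.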

dmcer
A: 

I don't think you can do that in libsvm, Weka, or SVM-light. One approach is to use something like a decision tree for your nominal attributes and an SVM or another distance-based classifier for your numeric attributes, and then combine the results. I hope it helps.

Ankur