In Matlab (Neural Network Toolbox + Image Processing Toolbox), I have written a script to extract features from images and construct a "feature vector". My problem is that some features have more data than others. I don't want these features to have more significance than others with less data.

For example, I might have a feature vector made up of 9 elements:

hProjection = [12,45,19,10];
vProjection = [3,16,90,19];
area = 346;

featureVector = [hProjection, vProjection, area];

If I construct a neural network with featureVector as its input, the area makes up only one of the nine input elements (about 11% of the input data) and is less significant.

I'm using a feed-forward back-propagation network with a tansig transfer function (pattern-recognition network).

How do I deal with this?

+3  A: 

When you present your input data to the network, each column of your feature vector is fed to the input layer as an attribute by itself. The only bias you have to worry about is the scale of each feature (i.e., we usually normalize the features to the [0,1] range).
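As a minimal sketch of that normalization, assuming the samples are stacked one image per row and one feature per column (the matrix X and its values are hypothetical, made up for illustration), each column can be rescaled to [0,1] with base MATLAB functions only:

```matlab
% Hypothetical data: each row is one image, each column is one feature
% (4 hProjection values, 4 vProjection values, 1 area).
X = [12 45 19 10  3 16 90 19 346;
      8 30 25 14  5 20 70 22 410;
     15 50 12  9  2 11 95 17 298];

% Per-column minimum and maximum over all samples.
colMin = min(X, [], 1);
colMax = max(X, [], 1);

% Guard against constant columns, which would otherwise divide by zero.
colRange = colMax - colMin;
colRange(colRange == 0) = 1;

% Min-max scaling: every column now spans [0,1] regardless of its units.
Xnorm = bsxfun(@rdivide, bsxfun(@minus, X, colMin), colRange);
```

After this step the area column and each projection column cover the same numeric range, so none of them dominates the input purely because of its scale.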

Also, if you believe that the features are dependent/correlated, you might want to apply some kind of attribute selection technique. In your case, it depends on the meaning of the hProj/vProj features...


EDIT:
It just occurred to me that, as an alternative to feature selection, you can use a dimensionality reduction technique (PCA/SVD, Factor Analysis, ICA, ...). For example, factor analysis can be used to extract a set of latent (hidden) variables on which the hProj/vProj features depend. So instead of these 8 features, you can get 2 features such that the original 8 are a linear combination of the new two (plus some error term). Refer to this page for a complete example.
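A rough sketch of the dimensionality-reduction idea, using PCA via SVD rather than factor analysis because it needs only base MATLAB. The data matrix P, the area values, and the choice of k = 2 components are all hypothetical:

```matlab
% Hypothetical data: 5 images, 8 projection features each
% (4 hProjection values followed by 4 vProjection values).
P = [12 45 19 10  3 16 90 19;
      8 30 25 14  5 20 70 22;
     15 50 12  9  2 11 95 17;
     10 38 22 12  4 18 80 20;
     14 47 15 10  3 13 92 18];
area = [346; 410; 298; 372; 310];   % hypothetical area per image

k = 2;                              % number of components to keep (assumed)

% Center each column, then take the economy-size SVD.
mu = mean(P, 1);
Pc = bsxfun(@minus, P, mu);
[U, S, V] = svd(Pc, 'econ');

% Project the centered data onto the first k principal directions.
scores = Pc * V(:, 1:k);

% Fraction of total variance retained by the k components.
sv = diag(S);
explained = sum(sv(1:k).^2) / sum(sv.^2);

% New feature matrix: k projection components plus the area
% (the area would still need rescaling before training).
reduced = [scores, area];
```

Checking `explained` is one way to judge whether two components are enough for the projection features; if it is low, a larger k may be needed.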

Amro
All my input is normalized to [0,1]. I'm particularly interested in an "attribute selection technique". Do you know of any examples? All four elements in the hProj/vProj are directly correlated.
idea
Off the top of my head, there's Correlation-based Feature Selection (CFS). But many other methods exist; a quick search might help: http://www.google.com/search?q=matlab+feature+selection
Amro
Thanks Amro, I think Principal Component Analysis (PCA) is what I'm looking for. I'll look at applying this technique.
idea