Hi,

I am trying to write a bag-of-features image recognition system. One step in the algorithm is to take a large number of small image patches (say 7x7 or 11x11 pixels) and try to cluster them into groups that look similar. I get my patches from an image, turn them into gray-scale floating-point image patches, and then try to get cvKMeans2 to cluster them for me. I think I am having problems formatting the input data such that KMeans2 returns coherent results. I have used KMeans for 2D and 3D clustering before, but 49D clustering seems to be a different beast.

I keep getting garbage values in the returned clusters vector, so obviously this is a garbage-in/garbage-out type of problem. Additionally, the algorithm runs way faster than I think it should for such a huge data set.

In the code below, the straight memcpy is only my latest attempt at getting the input data into the correct format; I spent a while using the built-in OpenCV functions, but that is difficult when your base type is CV_32FC(49).

Can OpenCV 1.1's KMeans algorithm support this sort of high-dimensional analysis?

Does someone know the correct method of copying from images to the K-Means input matrix?

Can someone point me to a free, non-GPL KMeans algorithm I can use instead?

This isn't the best code as I am just trying to get things to work right now:

    std::vector<int> DoKMeans(std::vector<IplImage *>& chunks)
    {
        // the size of one image patch in bytes, CELL_SIZE = 7
        int chunk_size = CELL_SIZE * CELL_SIZE * sizeof(float);

        // create the input data; CV_32FC(49) is a 7x7 float object (I think)
        CvMat* data = cvCreateMat(chunks.size(), 1, CV_32FC(49));

        // Create a temporary array to hold the data
        // we'll copy into the matrix for KMeans
        int rdsize = chunks.size() * CELL_SIZE * CELL_SIZE;
        float* rawdata = new float[rdsize];

        // Go through each image chunk and copy the
        // pixel values into the raw data array.
        std::vector<IplImage*>::iterator iter;
        int k = 0;
        for (iter = chunks.begin(); iter != chunks.end(); ++iter)
        {
            for (int i = 0; i < CELL_SIZE; i++)
            {
                for (int j = 0; j < CELL_SIZE; j++)
                {
                    CvScalar val = cvGet2D(*iter, i, j);
                    rawdata[k] = (float)val.val[0];
                    k++;
                }
            }
        }

        // Copy the data into the CvMat for KMeans.
        // I have tried various methods, but this is just the latest.
        memcpy(data->data.ptr, rawdata, rdsize * sizeof(float));

        // Create the output array, one cluster label per input row
        CvMat* results = cvCreateMat(chunks.size(), 1, CV_32SC1);

        // Do KMeans with 128 clusters
        int r = cvKMeans2(data, 128, results,
                          cvTermCriteria(CV_TERMCRIT_EPS + CV_TERMCRIT_ITER, 1000, 0.1));

        // Copy the grouping information to our output vector
        std::vector<int> retVal;
        for (int y = 0; y < chunks.size(); y++)
        {
            CvScalar cvs = cvGet1D(results, y);
            int g = (int)cvs.val[0];
            retVal.push_back(g);
        }

        return retVal;
    }

Thanks in advance!

A: 

You might like to check out http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/ for another open-source clustering package.

Using memcpy like this seems suspect, because when you do:

    int rdsize = chunks.size()*CELL_SIZE*CELL_SIZE;

if CELL_SIZE and chunks.size() are very large, the product stored in rdsize can overflow. If it exceeds the largest value an int can hold, you may have a problem.
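
A minimal sketch of one way to guard against that, computing the count in std::size_t before it is ever narrowed to int (the exception is just one way to bail out):

    #include <cstddef>
    #include <limits>
    #include <stdexcept>

    // chunks.size() is already a std::size_t, so the whole
    // multiplication is carried out in std::size_t, not int.
    std::size_t rdsize = chunks.size() * CELL_SIZE * CELL_SIZE;
    if (rdsize > static_cast<std::size_t>(std::numeric_limits<int>::max()))
        throw std::runtime_error("patch data too large for int indexing");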

Do you want to change "chunks" in this function? I'm guessing that you don't, as this is a K-means problem.

So try passing a reference to const here. (And generally speaking, this is what you will want to be doing.)

So instead of:

    std::vector<int> DoKMeans(std::vector<IplImage *>& chunks)

it would be:

    std::vector<int> DoKMeans(const std::vector<IplImage *>& chunks)

Also, in this case it is better to use static_cast than the old C-style casts (for example, static_cast<float>(variable) as opposed to (float)variable).

Also, you may want to delete "rawdata":

    float * rawdata = new float[rdsize];

can be deleted with:

    delete[] rawdata;

otherwise you may be leaking memory here.
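
Or sidestep the manual delete[] entirely. A sketch using std::vector, whose storage is freed automatically and which still gives you a contiguous float buffer for memcpy:

    #include <vector>

    // std::vector owns the buffer, so no delete[] is needed;
    // &rawdata[0] yields a contiguous float* to the elements.
    std::vector<float> rawdata(rdsize);
    // ... fill rawdata[k] exactly as before ...
    memcpy(data->data.ptr, &rawdata[0], rdsize * sizeof(float));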

shuttle87
Shuttle87, good catch on the integer overflow in rdsize. That is not a problem now, but it will be later. As for the other little errors (i.e. not cleaning up, const correctness), I am just trying to see if this works or not; I will certainly rewrite/clean up everything when I am done. I will look into the other clustering package you pointed out.
kscottz
It just occurred to me that if storing data of this size is a problem for you, it might also be a problem within the internals of the library you are using. Perhaps there is some value in posting this in an OpenCV forum, as their library may not be able to support the size of the data you are using.
shuttle87
I have only seen cvKMeans do up to three dimensions, and the way you need to pack the data is really wonky. I would assume the input matrix would be a linear representation of something that is s x d, where s is the number of samples and d is the dimension of a point. Anyway, I have tried about ten different approaches and nothing has worked, so I am using a GPL KMeans algorithm I found on the web, and it appears to work great. I would still like to find something with a more open license. I have been using OpenCV for a while, but I am not aware of an open forum for questions.
kscottz
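
For reference, a minimal sketch of that s x d packing with the OpenCV 1.x C API: an N x 49 single-channel float matrix, one row per flattened patch. The helper name PackPatches is illustrative, and whether cvKMeans2 in 1.1 actually handles 49 columns is exactly the open question above:

    // Sketch: pack N patches into an N x 49 CV_32FC1 matrix,
    // one row per flattened 7x7 patch, then cluster the rows.
    // Assumes CELL_SIZE from the question's code and <cv.h>.
    CvMat* PackPatches(const std::vector<IplImage*>& chunks)
    {
        const int dim = CELL_SIZE * CELL_SIZE;  // d = 49
        CvMat* samples = cvCreateMat((int)chunks.size(), dim, CV_32FC1);
        for (int s = 0; s < (int)chunks.size(); s++)
        {
            for (int i = 0; i < CELL_SIZE; i++)
                for (int j = 0; j < CELL_SIZE; j++)
                {
                    CvScalar val = cvGet2D(chunks[s], i, j);
                    CV_MAT_ELEM(*samples, float, s, i * CELL_SIZE + j) =
                        (float)val.val[0];
                }
        }
        return samples;
    }

    // Usage: one integer label per row comes back in 'labels'.
    CvMat* samples = PackPatches(chunks);
    CvMat* labels  = cvCreateMat(samples->rows, 1, CV_32SC1);
    cvKMeans2(samples, 128, labels,
              cvTermCriteria(CV_TERMCRIT_EPS + CV_TERMCRIT_ITER, 1000, 0.1));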
A: 

Though I'm not familiar with "bag of features", have you considered using feature points, such as those found by corner detectors or SIFT?
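
If it helps, here is a minimal sketch of the corner-detector route in the same OpenCV 1.x C API, assuming a gray-scale input image named gray (the name and the parameter values are illustrative, not from the original post). The returned corners could then serve as patch centers:

    // Sketch: find up to max_corners strong corners in a gray-scale
    // IplImage*, which could then serve as patch centers.
    const int max_corners = 200;
    CvPoint2D32f corners[max_corners];
    int corner_count = max_corners;

    // Scratch buffers required by cvGoodFeaturesToTrack
    IplImage* eig  = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);
    IplImage* temp = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);

    cvGoodFeaturesToTrack(gray, eig, temp, corners, &corner_count,
                          0.01,   // quality level relative to the best corner
                          10.0);  // minimum distance between corners, in pixels

    // corner_count now holds the number of corners actually found
    cvReleaseImage(&eig);
    cvReleaseImage(&temp);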

rwong