views:

580

answers:

5

I currently have a 5D array in the variable called template written into 1D array called template1D, with a hash table of 3456 (8 * 12 * 3 * 4 * 3) entries. In Matlab, the multi-dimensional array was accessed as follows:

template{idx_r, idx_l, idx_rho, idx_alpha, idx_beta}

However, as I have the indices going from 0-7, 0-11, 0-2, 0-3, and 0-2 respectively, I'm not entirely sure what would be the easiest way to retrieve an overall index number from these five indices in order to obtain the right segment in the template array properly. What would be the easiest method to make such hash functions properly?

+2  A: 

Have you considered using a string to hash? You could even make it a hexadecimal number,

5 characters

#Character 0 is in the range '0'...'7',
#Character 1 is in the range '0'...'B',
#Character 2 is in the range '0'...'2',
#Character 3 is in the range '0'...'3',
#Character 4 is in the range '0'...'2'

Part of the beauty of the string as a hexadecimal number is that it's got an easy int...err... unsigned long long representation as well, if you ever need to convert.

Aftermathew
Even with using a string to hash, how would I get the corresponding segment out of the template array?
stanigator
Well converting your string into 5 seperate integers, 1 for each character, would be trivial. I don't use matlab - am I missing something here?
Aftermathew
+3  A: 

If it were a 2D array arranged as a flat array, you would have multiplied the first index by the size of the first dimension and added the second index. In the same way, with 5 dimensions you would do something like:

index = (((i1*l1 + i2)*l2 + i3)*l3 + i4)*l4 + i5;
Tal Pressman
This is actually what the function SUB2IND does for you in MATLAB.
gnovice
I think the sizes may be off in your equation. For a 2-D array, you would multiply the first index by the size of the *second* dimension and add the second index. Extending to the larger-dimension array, it would be: index = (((i1*l2 + i2)*l3 + i3)*l4 + i4)*l5 + i5;
gnovice
Thanks for the suggestion. I'll let you know how my discrepancies show between your suggestion and what I obtained using sub2ind in Matlab for the same indices.
stanigator
Discrepancies fixed. Thanks gnovice.
stanigator
A: 

I don't know if I understood the question correctly so I will state what I understood:

  • You have an array of a given size that represents a multidimensional matrix.
  • You want to convert a 5 dimensional vector into the real position of the array

I would first create a class to encapsulate it, and I would provide operator() defined with 5 arguments (std::size_t, unsigned int or so). This operator() should first check ranges (maybe throw an exception and convert all the arguments to the final position.

Simplest transformation would be:

size_t position( size_t idx_r, size_t idx_l, size_t idx_rho, size_t idx_alpha, size_t idx_beta )
{
   size_t pos = (((((((idx_r * dim_l) + idx_l) * dim_rho) + idx_rho) * dim_alpha) + idx_alpha) * dim_beta) + idx_beta;
   return pos;
}

Where *dim_XXX* represent the size of the matrix in dimension XXX.

If you are performing many operations you may want to consider representing internally the data with a different order without changing the interface to get a better cache hit rate.

The general algorithm is converting each index in one dimension to the following dimension element and adding the offset at that dimension. Easiest example is with a 2 dimension system. To access row 3 column 2 on a 10 column array (assuming you store by rows, and counting from 1 on for the sake of argument) you will first calculate the start of the third row, that is 3 * 10 elements per row. Then you add the offset inside the row 2 (column 2).

If you grow it to a 3 dimensional array, first you would have to find the plane you are looking for by multiplying the plane index by the size of the plane and then use the previous algorithm:

size_t position( size_t x, size_t y, size_t z )
{
   size_t start_of_plane_z = z * dim_y * dim_x; // z * size_of_plane
   size_t start_of_row_y = y * dim_x; // y * size_of_row
   size_t offset_inside_row = x;

   return start_of_plane_z + start_of_row_y + offset_inside_row;
}

Now, applying some basic algebra you can turn the equation into:

size_t pos = (((z * dim_y) + y) * dim_x) + x;

That will reduce the number of multiplications and be easier to define for a higher number of dimensions.

David Rodríguez - dribeas
+7  A: 

Not sure exactly what you are trying to do here, but have you considered the functions ind2sub and sub2ind? They may help. You may need to worry about 0 versus 1-based indices as MATLAB is 1-based.

--Loren

Loren
+1 for being "the" Loren from Mathworks (I assume?)... welcome to SO....
Jason S
Yes, welcome! It's great to see more MathWorkers showing up here... although that means more stiff competition for answering questions. =)
gnovice
+1  A: 

Although you can certainly do the math yourself to compute the linear index (as pointed out by Tal), it would be cleaner and easier to read using the built-in function SUB2IND (as pointed out by Loren). For your example, you would use it in the following way:

index = sub2ind([8 12 3 4 3], idx_r, idx_l, idx_rho, idx_alpha, idx_beta);

If all of your indices are 0-based, you would have to add 1 to each of them before passing them to SUB2IND.

EDIT:

If you would like to see how to correctly compute linear indices by yourself in MATLAB, such that they agree with the results from SUB2IND, here's the code you would use:

index = (((idx_beta*4 + idx_alpha)*3 + idx_rho)*12 + idx_l)*8 + idx_r + 1;

NOTE that the indices that need to be used with this equation have to be 0-based, while the indices passed to SUB2IND have to be 1-based. To generalize this equation to an arbitrary number of dimensions N:

index = (...(idx(N)*sz(N-1) + idx(N-1))*sz(N-2) + ...)*sz(1) + idx(1) + 1;

or more succinctly:

index = 1 + sum(idx.*cumsum([1 sz(1:(N-1))]));

where idx is an array of 0-based index values for each dimension and sz is an array of the sizes of each dimension.

gnovice
Is there a similar function in any of the C++ libraries that doesn't make use of mex functions that does similar things? I'm having trouble finding it if it exists.
stanigator
I haven't come across a C++ equivalent of SUB2IND. Are you needing to do your indexing in a mex file you are writing? I know there is a function called "mexCallMATLAB" (http://www.mathworks.com/support/tech-notes/1600/1605.html#example5) that allows you to call MATLAB functions from a mex file, but this may only work in C.
gnovice
Currently, I'm running the C++ code as a mex file, but eventually I won't be, hence the question if there is a C++ equivalent other than the accepted answer.
stanigator
In that case, I'd just go with Tal's answer. Writing your own little function based on the formula he gave is probably the fastest and easiest thing for you to do.
gnovice
I just checked with the sub2ind function using the same indices in Matlab (except they are offset by 1 in C++ as array indices start at 0), and the computed index appears different from Tal's answer. Could the offset be the reason why I'm getting this discrepancy?
stanigator
Tal's answer is actually backwards for what you would have to do in MATLAB. I'll update my answer to show you the way that should agree with the results from SUB2IND.
gnovice
I have discovered some discrepancies that I have with using sub2ind and using the subscripts directly. i.e. template{sub2ind([8 12 3 4 3], 1, 10, 1, 1, 1} != template{1, 10, 1, 1, 1}. Does anyone know why that is the case, or should this go on a separate question?
stanigator
I just performed this test myself and it appears to work (i.e. "tf" equals 1, so they index the same element): size_t=[8 12 3 4 3]; template=num2cell(rand(size_t)); index={1 10 1 1 1}; tf=isequal(template{sub2ind(size_t,index{:})},template{index{:}}); *However*, if I instead do the following, it no longer works because "index" is already assumed to be an array of linear indices, not a subscript array: index=[1 10 1 1 1]; tf=isequal(template{sub2ind(size_t,index)},template{index}); In this case, "template{...}" returns a comma-separated list of values and isequal checks them all for equality.
gnovice