ansaurus

Question

2D Data Structure

Answer 1

A:

You can use a hash table and insert (i,j) -> node where (i,j) is a 2-tuple containing 2 integers. You can write your own custom class which defines Equals method and a GetHash() method for that ... or Python gives it to you free of charge.

Now ... what exactly do you mean - sortable by a row or a column? Give an example with values please!

Hamish Grubijan 2010-01-03 17:11:41

I thought of a hash table, but then the sorting becomes very troublesome.

Vanwaril 2010-01-03 17:22:40

Nope, not very troublesome at all, but it will take O(m*n).

Hamish Grubijan 2010-01-03 17:28:27

You need to look at the nth row or column, extract it as a list, sort it, deduce the permutation list e.g. (1,2,5,4,3,6) showing where the indices should go, an then apply this order to ALL elements inside the dictionary. This is a fun problem, and Python implementation can be quite concise.

Hamish Grubijan 2010-01-03 17:33:28

Well, what you're saying then is to use a hash table instead of my vector, but that makes no difference, because the vector is still amortized constant time insertion. the node structure will still have to have 4 pointers, and the sorting and insertion logic will still have to deal with those.

Vanwaril 2010-01-03 17:35:51

Yes. The way I'm doing it is through the indexing vectors -- I'm sorting the index based on the values of the permutation list. My implementation is in C++ and pretty long though, heh :)

Vanwaril 2010-01-03 17:37:34

I am not a C++ guru. My feeling is that sometimes the complexity of C++ gets in the way. I like simpler implementations because they are less prone to bugs. Python + dict allows me to do exactly that, IMO. Also, how fast is your insertion / deletion of a row? Finally, if you are using c++, then add it as a tag.

Hamish Grubijan 2010-01-03 17:48:05

Insertion and deletion of a row in O(m+n). It goes across the header to the row/column number and inserts, changing links on either side. Similarly for delete, connecting links.

Vanwaril 2010-01-24 16:29:01

Answer 2

A:

Perhaps by creating a small database for it?

Databases sorting algorithms probably are better than reinventing the wheel. MySql would do. In order to gain performance, table can be created in memory. Then you can index on columns as a usual table, and let the database engine do the dirty job (ordering and such). And then you just harvest the results.

oscar 2010-01-03 18:40:49

Answer 3

+3 A:

I would create two index arrays, one for the columns, and one for the rows. So for your data

1 100 25 34
2 20  15 16
3 165 1  27

You create two arrays:

cols = [0, 1, 2, 3]
rows = [0, 1, 2]

Then when you want to sort the matrix by the 3rd row, you keep the original matrix intact, but just change the indices array accordingly:

cols = [2, 0, 3, 1]
rows = [0, 1, 2]

The trick now is to access your matrix with one indirection. So instead of accessing it with m[x][y] you access it by m[cols[x]][rows[y]]. You also have to use m[cols[x]][rows[y]] when you perform the reordering of the rows/cols array.

This way sorting is O(n*log(n)), and access is O(1).

For the data structure, I would use an array with links to another array:

+-+
|0| -> [0 1 2 3 4]
|1| -> [0 1 2 3 4]
|2| -> [0 1 2 3 4]
+-+

To insert a row, just insert it at the last position and update the the rows index array accordingly, with the correct position. E.g. when rows was [0, 1, 2] and you want to insert it at the front, rows will become [3, 0, 1, 2]. This way insertion of a row is O(n).

To insert a column, you also add it as the last element, and update cols accordingly. Inserting a column is O(m), row is O(n).

Deletion is also O(n) or O(m), here you just replace the column/row you want to delete with the last one, and then remove the index from the index array.

martinus 2010-01-24 16:38:21

But then insertion and deletion of an element are O(m*n), and of a row, O(m^2*n)..

Vanwaril 2010-01-25 05:55:32

Hi Vanwaril, I have update the entry, I think insertion and deletion can also be done in O(n) or O(m).

martinus 2010-01-25 08:32:16

I never thought of swapping for delete... thanks!

Vanwaril 2010-01-25 14:50:22

Answer 4

A:

If I were handed this problem, I'd create row and column remapping vectors. E.G. to sort rows, I'd determine row order as normal, but instead of copying rows, I'd just change the row remapping vector.

It would look something like this:

// These need to be set up elsewhere.
size_t nRows, nCols;
std::vector<T> data;

// Remapping vectors.  Initially a straight-through mapping.
std::vector<size_t> rowMapping(nRows), colMapping(nCols);
for(size_t y = 0; y < nRows; ++y)
    rowMapping[y] = y;
for(size_t x = 0; x < nCols; ++x)
    colMapping[x] = x;

// Then you read data(row, col) with
T value = data[rowMapping[row] * nCols + colMapping[col]];

P.S. a small optimization would be to store pointers in rowMapping instead of indices. This would let you do T value = rowMapping[row][colMapping[col]];, however, you would have to recalculate the pointers every time that the dimensions of data changes, which could be error-prone.

Mike DeSimone 2010-01-24 16:40:48

Again, the problem with this is though access and sorting are fast, insertion and deletion are not.

Vanwaril 2010-01-25 06:00:41

Insertion and deletion are O(n) if you pre-allocate rows and columns. Also, fast insertion and deletion was not specified as a requirement.

Mike DeSimone 2010-01-25 12:16:27

Answer 5

+1 A:

Just to add to martinus and Mike's answers: what you need is, in essence, pivoting, which is what they suggest and a very well known technique used in pretty much any numeric algorithm involving matrices. For example, you can run a quick search for "LU decomposition with partial pivoting" and "LU decomposition with full pivoting". The additional vectors that store the permutations are called the "pivots".

d. 2010-01-24 17:06:32

ansaurus

tags:

views:

answers:

2D Data Structure

related questions