ansaurus

Question

Grid data structure

Answer 1

+1 A:

You could have a two dimensional "linked matrix" (I forget the proper terminology):

... Col 3 ... Col 4 ...
      |         |
... --X-- ... --Y-- ...
      |         |
...  ...  ...  ...  ...

Each cell has four neighbours, as shown. Additionally you need row and column headers that might indicate the row/column position, as well as pointing to the first cell in each row or column. These are most easily represented as special cells without an up neighbour (for column headers).

Inserting a new column between 3 and 4 means iterating down the cells X in col 3, and inserting a new right neighbour Z. This new cell Z links leftward to X and rightward to Y. You also need to add a new column header, and link the new cells vertically. Then the positions of all the columns after 4 can be renumbered (col 4 becomes col 5).

... Col 3 Col 4 Col 5 ...
      |     |     |
... --X-----Z-----Y-- ...
      |     |     |
...  ...   ...   ...  ...

The cost of inserting a column is O(n) for inserting and linking the new cells, and again O(m) for updating the column headers. It's a similar process for deletion.

Because each cell is just four links, the same algorithms are used for row insertion/deletion.

Edmund 2010-08-23 02:40:40

thank you for your response,but it seems you didn't understand my question,by your answer you've made row operations much expensive as columns!, which is the opposite of what I need.in short, column deletion is still very slow because, still i have to go through all those cells and re-link again.it's not as easy and cheap as deleting a row.in column insertion, the column will be generated 'already' as column(think of replacing columns).however, i used 'deleting' for simplicity.

KA100 2010-08-23 03:47:48

Removing a column in my scheme has the same algorithmic cost as removing a row. If you have many many more rows than columns, then there could be a difference in absolute cost -- but this cost is still linear per cell. I don't think you can do better than that, except by deferring some of the work (e.g. just mark a column as deleted, then bulk delete all marked columns when saving).

Edmund 2010-08-23 04:17:36

I'm *not* talking about navigation,I need better structure for *Manipulation* not for navigation. actually 'list of lists" structure is better enough for navigation than adding more pointers each cell.again, what I'm looking for is better solution for effective data manipulation.thank you.

KA100 2010-08-23 23:48:25

This structure is not just lists of lists -- it's a 2-dimensionally linked *matrix*. Perhaps you could clarify the problem in your original question: you say "I need to ‘loop’ all rows to remove that indexes cells". If each row is an ordinary list, you need to iterate through each one just to find the cell to remove from that row; and then repeat for all other rows. Is that what you mean? That is not a problem with the structure I'm describing, because you can quickly step through all the cells in the column without having to go through the lists for each row.

Edmund 2010-08-24 02:37:43

sorry if my question wasn't clear, maybe I should say"I want O(1) deletion/insertion for both rows and columns" ---with rows it is, but why not columns also?

KA100 2010-08-24 03:01:58

Answer 2

A:

Keep your existing data structure as is. In addition, give each column a unique id when it is created. When you delete the column, just add its id to a hash table of all deleted column ids. Every time you walk a row, check each element's column id (which needs to be stored along with all other data for an element) against the hash table and splice it out of the row if it has been deleted.

The hash table and ids are unnecessary if you have a per-column data structure that each grid element can point to. Then you just need a deleted bit in that data structure.

By the way, Edmund's scheme would be fine for you as well. Even though it takes O(n) time to delete a row or column of length n, you can presumably amortize that cost against the cost of creating those n elements, making the delete O(1) amortized time.

Keith Randall 2010-08-24 05:09:35

Answer 3

A:

I know that "Linked-Lists" are usually appreciated from a theoretical point of view, but in practice they are generally inefficient.

I would suggest moving toward Random Access Containers to get some speed. The most simple would be an array, but a double-ended queue or an indexed skip list / B* tree could be better, depending on the data size we are talking about.

Conceptually, it doesn't change much (yet), however you get the ability to move to a given index in O(1) (array, deque) / O(log N) (skip list / B* tree) operations, rather than O(N) with a simple linked-list.

And then it's time for magic.

Keith has already exposed the basic idea: rather than actually deleting the column, you just need to mark it as deleted and then 'jump' above it when you walk your structure. However a hash table requires a linear walk to get to the Nth column. Using a Fenwick Tree would yield an efficient way to compute the real index, and you could then jump directly there.

Note that a key benefit of marking a row as deleted is the obvious possibility of an undo operation.

Also note that you might want to build a compacting function, to eliminate the deleted columns from time to time, and not let them accumulate.

Matthieu M. 2010-08-24 07:05:13

ansaurus

tags:

views:

answers:

Grid data structure

related questions