views:

74

answers:

3

Usually ‘expandable’ grids are represented as list of lists (list of rows, each row has list of cells) and those lists are some kind of linked lists.

Manipulating (removing, inserting) rows in this data structure is easy and inexpensive, it’s just matter of re-linking previous nodes, but when it comes to columns, for instance removing a column it become a very long operation, I need to ‘loop’ all rows to remove that indexes cells. Clearly this isn’t good behavior, at least for my case.

I’m not talking databases here; a good example I’ve found for this is some text file into a text editor, (as I know) text editors mostly splitting lines into rows and it’s easy to remove line. I want removing a column is as inexpensive and efficient as removing some row.

Finally, what I need is some Multi-dimensional grid but I think of any 2d simple grid would be applicable for MD, Am I right?

+1  A: 

You could have a two dimensional "linked matrix" (I forget the proper terminology):

... Col 3 ... Col 4 ...
      |         |
... --X-- ... --Y-- ...
      |         |
...  ...  ...  ...  ...

Each cell has four neighbours, as shown. Additionally you need row and column headers that might indicate the row/column position, as well as pointing to the first cell in each row or column. These are most easily represented as special cells without an up neighbour (for column headers).

Inserting a new column between 3 and 4 means iterating down the cells X in col 3, and inserting a new right neighbour Z. This new cell Z links leftward to X and rightward to Y. You also need to add a new column header, and link the new cells vertically. Then the positions of all the columns after 4 can be renumbered (col 4 becomes col 5).

... Col 3 Col 4 Col 5 ...
      |     |     |
... --X-----Z-----Y-- ...
      |     |     |
...  ...   ...   ...  ...

The cost of inserting a column is O(n) for inserting and linking the new cells, and again O(m) for updating the column headers. It's a similar process for deletion.

Because each cell is just four links, the same algorithms are used for row insertion/deletion.

Edmund
thank you for your response,but it seems you didn't understand my question,by your answer you've made row operations much expensive as columns!, which is the opposite of what I need.in short, column deletion is still very slow because, still i have to go through all those cells and re-link again.it's not as easy and cheap as deleting a row.in column insertion, the column will be generated 'already' as column(think of replacing columns).however, i used 'deleting' for simplicity.
KA100
Removing a column in my scheme has the same algorithmic cost as removing a row. If you have many many more rows than columns, then there could be a difference in absolute cost -- but this cost is still linear per cell. I don't think you can do better than that, except by deferring some of the work (e.g. just mark a column as deleted, then bulk delete all marked columns when saving).
Edmund
I'm *not* talking about navigation,I need better structure for *Manipulation* not for navigation. actually 'list of lists" structure is better enough for navigation than adding more pointers each cell.again, what I'm looking for is better solution for effective data manipulation.thank you.
KA100
This structure is not just lists of lists -- it's a 2-dimensionally linked *matrix*. Perhaps you could clarify the problem in your original question: you say "I need to ‘loop’ all rows to remove that indexes cells". If each row is an ordinary list, you need to iterate through each one just to find the cell to remove from that row; and then repeat for all other rows. Is that what you mean? That is not a problem with the structure I'm describing, because you can quickly step through all the cells in the column without having to go through the lists for each row.
Edmund
sorry if my question wasn't clear, maybe I should say"I want O(1) deletion/insertion for both rows and columns" ---with rows it is, but why not columns also?
KA100
A: 

Keep your existing data structure as is. In addition, give each column a unique id when it is created. When you delete the column, just add its id to a hash table of all deleted column ids. Every time you walk a row, check each element's column id (which needs to be stored along with all other data for an element) against the hash table and splice it out of the row if it has been deleted.

The hash table and ids are unnecessary if you have a per-column data structure that each grid element can point to. Then you just need a deleted bit in that data structure.

By the way, Edmund's scheme would be fine for you as well. Even though it takes O(n) time to delete a row or column of length n, you can presumably amortize that cost against the cost of creating those n elements, making the delete O(1) amortized time.

Keith Randall
A: 

I know that "Linked-Lists" are usually appreciated from a theoretical point of view, but in practice they are generally inefficient.

I would suggest moving toward Random Access Containers to get some speed. The most simple would be an array, but a double-ended queue or an indexed skip list / B* tree could be better, depending on the data size we are talking about.

Conceptually, it doesn't change much (yet), however you get the ability to move to a given index in O(1) (array, deque) / O(log N) (skip list / B* tree) operations, rather than O(N) with a simple linked-list.

And then it's time for magic.

Keith has already exposed the basic idea: rather than actually deleting the column, you just need to mark it as deleted and then 'jump' above it when you walk your structure. However a hash table requires a linear walk to get to the Nth column. Using a Fenwick Tree would yield an efficient way to compute the real index, and you could then jump directly there.

Note that a key benefit of marking a row as deleted is the obvious possibility of an undo operation.

Also note that you might want to build a compacting function, to eliminate the deleted columns from time to time, and not let them accumulate.

Matthieu M.