views:

539

answers:

9

I have a collection of objects in a database. Images in a photo gallery, products in a catalog, chapters in a book, etc. Each object is represented as a row. I want to be able to arbitrarily order these images, storing that ordering in the database so when I display the objects, they will be in the right order.

For example, let's say I'm writing a book, and each chapter is an object. I write my book, and put the chapters in the following order:

Introduction, Accessibility, Form vs. Function, Errors, Consistency, Conclusion, Index

It goes to the editor, and comes back with the following suggested order:

Introduction, Form, Function, Accessibility, Consistency, Errors, Conclusion, Index

How can I store this ordering in the database in a robust, efficient way?

I've had the following ideas, but I'm not thrilled with any of them:

  1. Array. Each row has an ordering ID, when order is changed (via a removal followed by an insertion), the order IDs are updated. This makes retrieval easy, since it's just ORDER BY, but it seems easy to break.

    // REMOVAL
    UPDATE ... SET orderingID=NULL WHERE orderingID=removedID
    UPDATE ... SET orderingID=orderingID-1 WHERE orderingID > removedID
    // INSERTION
    UPDATE ... SET orderingID=orderingID+1 WHERE orderingID > insertionID
    UPDATE ... SET orderID=insertionID WHERE ID=addedID

  2. Linked list. Each row has a column for the id of the next row in the ordering. Traversal seems costly here, though there may by some way to use ORDER BY that I'm not thinking of.

  3. Spaced array. Set the orderingID (as used in #1) to be large, so the first object is 100, the second is 200, etc. Then when an insertion happens, you just place it at (objectBefore + objectAfter)/2. Of course, this would need to be rebalanced occasionally, so you don't have things too close together (even with floats, you'd eventually run into rounding errors).

None of these seem particularly elegant to me. Does anyone have a better way to do it?

+1  A: 

The actsaslist mixin in Rails handles this basically the way you outlined in #1. It looks for an INTEGER column called position (of which you can override to name of course) and using that to do an ORDER BY. When you want to re-order things you update the positions. It has served me just fine every time I've used it.

As a side note, you can remove the need to always do re-positioning on INSERTS/DELETES by using sparse numbering -- kind of like basic back in the day... you can number your positions 10, 20, 30, etc. and if you need to insert something in between 10 and 20 you just insert it with a position of 15. Likewise when deleting you can just delete the row and leave the gap. You only need to do re-numbering when you actually change the order or if you try to do an insert and there is no appropriate gap to insert into.

Of course depending on your particular situation (e.g. whether you have the other rows already loaded into memory or not) it may or may not make sense to use the gap approach.

John
+2  A: 

I'd do a consecutive number, with a trigger on the table that "makes room" for a priority if it already exists.

Stu
This requires an O(n) restructuring on every insertion!
cdleary
+1  A: 

If the objects aren't heavily keyed by other tables, and the lists are short, deleting everything in the domain and just re-inserting the correct list is the easiest. But that's not practical if the lists are large and you have lots of constraints to slow down the delete. I think your first method is really the cleanest. If you run it in a transaction you can be sure nothing odd happens while you're in the middle of the update to screw up the order.

Eric Z Beard
+1  A: 

I did this in my last project, but it was for a table that only occasionally needed to be specifically ordered, and wasn't accessed too often. I think the spaced array would be the best option, because it reordering would be cheapest in the average case, just involving a change to one value and a query on two).

Also, I would imagine ORDER BY would be pretty heavily optimized by database vendors, so leveraging that function would be advantageous for performance as opposed to the linked list implementation.

pbh101
+2  A: 

An other alternative would be (if your RDBMS supports it) to use columns of type array. While this breaks the normalization rules, it can be useful in situations like this. One database which I know about that has arrays is PostgreSQL.

Cd-MaN
+1  A: 

Just a thought considering option #1 vs #3: doesn't the spaced array option (#3) only postpone the problem of the normal array (#1)? Whatever algorithm you choose, either it's broken, and you'll run into problems with #3 later, or it works, and then #1 should work just as well.

onnodb
+2  A: 

Use a floating point number to represent the position of each item:

Item 1 -> 0.0

Item 2 -> 1.0

Item 3 -> 2.0

Item 4 -> 3.0

You can place any item between any other two items by simple bisection:

Item 1 -> 0.0

Item 4 -> 0.5

Item 2 -> 1.0

Item 3 -> 2.0

(Moved item 4 between items 1 and 2).

The bisection process can continue almost indefinitely due to the way floating point numbers are encoded in a computer system.

Item 4 -> 0.5

Item 1 -> 0.75

Item 2 -> 1.0

Item 3 -> 2.0

(Move item 1 to the position just after Item 4)

Seun Osewa
This /will not/ continue indefinitely. For floating point numbers (doubles) values will converge after 53 rounds, in the pathological case. Even if your DBMS uses arbitrary precision decimals you will have a lot of data structure bloat.
cdleary
+1  A: 

I had this problem as well. I was under heavy time pressure (aren't we all) and I went with option #1, and only updated rows that changed.

If you swap item 1 with item 10, just do two updates to update the order numbers of item 1 and item 10. I know it is algorithmically simple, and it is O(n) worst case, but that worst case is when you have a total permutation of the list. How often is that going to happen? That's for you to answer.

moffdub
+1  A: 

Since I've mostly run into this with Django, I've found this solution to be the most workable. It seems that there isn't any "right way" to do this in a relational database.

tghw