views:

308

answers:

7

I need to make a design decision about database. The requirment is that one database table has an *AUTO_INCREMENT PRIMARY KEY* field called id. By default, each row is shown to user (in web) sorted ascendenting by id. For example, if there are 4 records in the table. The UI will show rows in sequence of 0, 1, 2, 3.

Now, there is requirement that user can drag & drop row in UI to change sequence. Say, user drag rom 3 and drop it befow 0. So, the display sequence turns to be 3, 0, 1, 2. This sequence should be persistent into database.

I'm wondering how to design database table to make this persistent and scalable. My first thought is that each row has a "sequence" field indicating the display sequence. By default, the value should be same as id. When selecting data from database for display, the rows are sorted ascending on sequence instead of id.

If sequence is changed, then it is updated to new value. The result is that it may involves a lot of changes in other rows. Taking example above, originally the table is like this:

|id   | sequence |
|0    | 0        |
|1    | 1        |
|2    | 2        |
|3    | 3        |

Now, after drag row with id 3 to first. Its sequence is updated to 0. At the same time, row with id 0, 1, 2 should also be updated.

|id   | sequence |
|0    | 1        |
|1    | 2        |
|2    | 3        |
|3    | 0        |

I'm afraid this approach will make re-sequence cost a lot resource and not scalable. So, I suppose the sequence can be intiialized by multiplying id with K(say, 10). This leaves gaps between sequence value for insertion. However, the gap can still consumed up if K+1 rows are moved to this gap.

|id   | sequence |
|0    | 0        |
|1    | 10       |
|2    | 20       |
|3    | 30       |

This seems a common problem to database design. Anybody has better idea to achive this?

+2  A: 

How about linked lists? :-)

CREATE TABLE item(
    id INT PRIMARY KEY,
    prev INT,
    next INT
);

WITH RECURSIVE sequence AS (
    SELECT item.id, item.prev, item.next FROM item
    WHERE item.prev IS NULL
  UNION
    SELECT item.id, item.prev, item.next FROM sequence
    INNER JOIN item ON sequence.next = item.id
)
SELECT * FROM sequence;

Actually, I don't have a PostgreSQL on hand to test if this actually works (and MySQL doesn't support SQL-99's WITH RECURSIVE), and I don't seriously recommend it either.

ephemient
If it wasn't limited to MySQL I would say this is the best solution.
Kev
LOL really? O(1) updates and insertion, but running this particular query is O(n).
ephemient
A: 

--Edit: Just for others reading this post, I've been downvoted out of some weird personal grudge, not general inaccuracy in relation to this topic, give that consideration when reading, and vote how you wish! :)

-- Old:

The typical way to do this is with your 'Sequence' number (I call it 'SortOrder').

It's really quite easy.

"Scalable" doesn't come into it; because you've already got a list of all the nodes you're working on (drag and drop, as you say) so what you're really doing is just swapping those numbers.

Trivial.

Noon Silk
If you drag an item from the end to the beginning, wouldn't you expect that everything moves down one slot instead of simply swapping the two rows?
ephemient
Eph: Sure; but how many items are being displayed? I don't think it's practical that it would be an issue.
Noon Silk
Hahaha. Ah cletus. You are amusing.
Noon Silk
It's odd to brush this off as trivial and assume that there won't be much data. Obviously the OP thinks there'd be a performance issue, and they probably have good reason. How can you say "scalable doesn't come into it"? How do you know that there aren't 100,000 rows in this table? And even if there are only 20, what if this operation is being performed on the server 50 times per second by various users?
Joshua Carmody
+2  A: 

The ID and Sequence/SortOrder are separate and should not depend on each other at all.

for a move-up/move-down feature: You can swapping Sequence/SortOrder Values

or

For a drag and drop feature:

1) Establish the new Sequence/OrderNumber for the selected record.

2) Get the selected records current sequence, then Update the selected record with the new number.

3) a) If the New Sequence number is below the current sequence number increment all the sequence numbers for the records that have a sequence number >= the new sequence number (exlcuding the selected one)

b) if the New sequence number is above the current sequence number, decrement all the sequence numbers below the new selected one and above the current one.

Hope this makes sense and I have though it out the right way (below is the actual implementation).

I have implemented this in a single SQL statement that has some small amount of logic, not for the purists, but its works well.

Here is an example (OP: you will want to change the GUID IDs to INTs):

CREATE PROCEDURE [proc_UpdateCountryRowOrder]
    @ID UNIQUEIDENTIFIER,
    @NewPosition INT
AS

SET NOCOUNT ON

DECLARE @CurrentPosition INT
DECLARE @MaximumPosition INT

IF (@NewPosition < 1) SET @NewPosition = 1

SELECT @CurrentPosition = [Countries].[Order]
FROM [Countries]
WHERE [Countries].[ID] = @ID

SELECT @MaximumPosition = MAX([Countries].[Order])
FROM [Countries]

IF (@NewPosition > @MaximumPosition) SET @NewPosition = @MaximumPosition

IF (@NewPosition <> @CurrentPosition)
BEGIN
    IF (@NewPosition < @CurrentPosition)
    BEGIN
     BEGIN TRAN

     UPDATE [Countries]
     SET [Countries].[Order] = [Countries].[Order] + 1
     WHERE [Countries].[Order] >= @NewPosition
     AND [Countries].[Order] < @CurrentPosition

     UPDATE [Countries]
     SET [Countries].[Order] = @NewPosition
     WHERE ID = @ID

     COMMIT TRAN
    END
    ELSE
    BEGIN
     BEGIN TRAN

     UPDATE [Countries]
     SET [Countries].[Order] = [Countries].[Order] - 1
     WHERE [Countries].[Order] <= @NewPosition
     AND [Countries].[Order] > @CurrentPosition

     UPDATE [Countries]
     SET [Countries].[Order] = @NewPosition
     WHERE ID = @ID

     COMMIT TRAN
    END
END
GO
Mark Redman
This approach can update quite a few records for single drag-and-drop. will it be costy?
Morgan Cheng
I agree, all records need to be updated which would be costly but I think would be required to definitively assign a sort order. Just had a thought: What if the sorting was extracted from the data into another table and JOINin the two. This way only the "SortOrder" table would be updated? Any comments on that?
Mark Redman
if you move item 902 to position 5, then you're going to have to update the old 5-901 along with 902. I don't think you can avoid the cost if you want to be able to do this persistently
warren
The only way to know if it will be too costly is to create some test data of the magnitude you can expect in real life, then try it out.
Kev
A: 

I have done this by returning a CSV string of the ids (db keys) in the order selected by the user to the server. I have a function on my db which will convert a csv string into a table with 2 fields - the id and a sequence (which is actually an int with an identity). The sequence field values in this temp table reflect the order of the items in the CSV string. I then update the data table with the new sequence field value, matching with the ids.

Edit: the bounty points looked so tasty, I thought I would provide some details on my answer. Here is the code:

declare @id_array varchar(1000)
set @id_array = '47,32,176,12,482'

declare @id_list_table table ([id] int, [sequence] int)

insert @id_list_table ([id], [sequence])
  select [id], [sequence]
  from get_id_table_from_list (@id_array)

update date_table
  set [sequence] = id_list.[sequence]
  from date_table
    inner join @id_list_table as id_list
      on (id_list.[id] = date_table.[id])

I set up @id_array as a variable for testing - typically your UI would get the id values in their revised order and pass it as a parameter to a stored proc. The get_id_table_from_list function parses a csv string into a table with two 'int' columns: [id] and [sequence]. The [sequence] column is an identity. The results of that function working on my test data look like this:

    id   seq
    47   1  
    32   2  
    176  3  
    12   4  
    482  5  

You will need a function that will parse the csv (I can post mine if you are interested, and I have seen others posted here and there). Note that my code assumes you are using sql server - the sequence thing depends on an identity field, and the update query uses a T-SQL extension (the 'from' clause) - if you are using some other db, you would need to make some simple changes.

Ray
+5  A: 

The obvious answer to me is to use the last solution you mentioned but with decimals (floats).

So you start with, say: {0.1, 0.2, 0.3, 0.4, 0.5}. If you move the last item to between 0.2 and 0.3 it becomes 0.25. If you move it to the top it becomes 0.05. Each time you simply take the mid-point of the two numbers on either side. In other words, the average of the previous/next items.

Another similar solution is to use characters, then sort by strings alphabetically. Starting with {1, 2, 3, 4, 5}, if you move the 5 between 2 and 3 you'd use 25. If you do a string sort of the list you keep the right order: {1, 2, 25, 3, 4}.

The only problem I can think of with these methods is that eventually, you will hit the limit of floating point precision, i.e. trying to find a number between 0.0078125 and 0.0078124. A few ways to solve this:

  • Run a script every so often that runs through every item and reorders them to {0.1, 0.2, 0.3, ...}.
  • Don't use two decimal places when you can use one. Between 0.2 and 0.25 you could use 0.23 instead of the calculated 0.225.
  • Re-sequence locally, not globally. If you have {0.2, 0.3, 0.6} and want to insert after 0.2, you could set the second one to 0.4 and insert the new item at 0.3.
DisgruntledGoat
Nice answer! I was going to suggest sequence numbers with a fixed interval initially (say (10,20,30,40) so that you could easily re-sequence a row, but the "float" solution is better as you can always find a value between the before and after rows.
James Anderson
Not "always"... you'll eventually hit the limit of floating-point precision. But that will probably take a while :)
ephemient
Yes, that's what the last part of the answer is about. At some point, you'll have to re-sequence manually. I like the idea of re-sequencing small groups of numbers where applicable.
DisgruntledGoat
+1  A: 

I answered a similar question here : http://stackoverflow.com/questions/339743/visually-order-large-data-sets/339867#339867

If you move many items then you have to loop and move each item and check for overflow also. But all in all, the basic logic is to have a sorting column with gaps that can be reinitialized periodically.

sindre j
A: 

This fix is easier than you think. One temp table and one update query and your done.


CREATE TABLE #TempData
(
NewSequence bigint identity(1,1),
[Id] BigInt
)

INSERT INTO #TempData ([Id])
SELECT [Id] 
FROM TableNameGoesHere
ORDER BY Sequence

UPDATE TableNameGoesHere
SET Sequence = t2.NewSequence
FROM TableNameGoesHere t1
INNER JOIN #TempData t2
ON t1.[Id] = t2.[Id]

DROP TABLE #TempData
Cape Cod Gunny