ansaurus

Question

Answer 1

+5 A:

If you are using SQL Server 2005+ or higher, you can use a ranking function:

Select tblA.Id, tblA.Name
    , Row_Number() Over ( Partition By tblB.[Group] Order By tblA.Id ) As Sequence
    , tblA.tblBID
From tblA
    Join tblB
        On tblB.tblBID = tblB.ID

Row_Number ranking function.

Here's another solution that would work in SQL Server 2000 and prior.

Select A.Id, A.Name
    , (Select Count(*)
        From tblB As B1
        Where B1.[Group] = B.[Group]
            And B1.Id < B.ID) + 1 As Sequence
    , A.tblBID
From tblA As A
    Join tblB As B
        On B.Id = A.tblBID

EDIT

Also want to make it clear that I want to actually update tblA to reflect the proper sequences.

In SQL Server, you can use their proprietary From clause in an Update statement like so:

Update tblA
Set Sequence =  (
                Select Count(*)
                From tblB As B1
                Where B1.[Group] = B.[Group]
                    And B1.Id < B.ID
                ) + 1
From tblA As A
    Join tblB As B
        On B.Id = A.tblBID

The Hoyle ANSI solution might be something like:

Update tblA
Set Sequence = (
                Select (Select Count(*)
                        From tblB As B1
                        Where B1.[Group] = B.[Group]
                            And B1.Id < B.ID) + 1
                From tblA As A
                    Join tblB As B
                        On B.Id = A.tblBID
                Where A.Id = tblA.Id
                )

EDIT

Can we do that [the inner group] comparison based on A.Sequence instead of B.ID?

Select A1.*
    , (Select Count(*)
        From tblB As B2
            Join tblA As A2
                On A2.tblBID = B2.Id
        Where B2.[Group] = B1.[Group]
            And A2.Sequence < A1.Sequence) + 1
From tblA As A1
    Join tblB As B1
        On B1.Id = A1.tblBID

Thomas 2010-08-20 18:09:40

I need it to work on sql server 2000, so the first solution won't work. Maybe I should be more clear, they need to have the same sequence order as in the original tblA if at all possible. I am not sure why you are comparing B IDs as they may or may not be sequential, even though my example is.

CaptnCraig 2010-08-20 21:18:03

@CapnCraig - I'm not comparing tblB.IDs other than to themselves. Thus, it is OK that they have gaps. In short, we need a way of determining the order within a group. In my second example, I am simply saying that the lowest tblB.ID will get a ranking of 1. The next lowest will get a ranking of 2 and so on.

Thomas 2010-08-20 21:22:17

@CapnCraig - Updated my post to illustrate the Update syntax.

Thomas 2010-08-20 21:36:27

Can we do that comparison based on A.Sequence instead of B.ID?

CaptnCraig 2010-08-20 21:41:01

@CaptnCraig - You mean do the inner grouping sort on A.Sequence?

Thomas 2010-08-20 21:55:52

@CaptnCraig - Updated post to do inner group ordering on A.Sequence. I'll leave it to you to convert that into an Update statement.

Thomas 2010-08-20 22:07:11

Answer 2

A:

Something like:

SELECT a.id, a.name, row_number() over (partition by b.group order by a.id)
FROM tblA a
  JOIN tblB on a.tblBID = b.ID;

a_horse_with_no_name 2010-08-20 18:09:44

Answer 3

+2 A:

Because it's SQL 2000, we can't use a windowing function. That's okay.

Thomas's queries are good and will work. However, they will get worse and worse as the number of rows increases—with different characteristics depending on how wide (the number of groups) and how deep (the number of items per group). This is because those queries use a partial cross-join, perhaps we could call it a "pyramidal cross-join" where the crossing part is limited to right side values less than left side values rather than left crossing to all right values.

What to do?

I think you will be surprised to find that the following long and painful-looking script will outperform the pyramidal join at a certain size of data (which may not be all that big) and eventually, with really large data sets must be considered a screaming performer:

CREATE TABLE #tblA (
   ID int identity(1,1) NOT NULL,
   Name varchar(1) NOT NULL,
   Sequence int NOT NULL,
   tblBID int NOT NULL,
   PRIMARY KEY CLUSTERED (ID)
)

INSERT #tblA VALUES ('a', 5, 14)
INSERT #tblA VALUES ('b', 3, 15)
INSERT #tblA VALUES ('c', 3, 16)
INSERT #tblA VALUES ('d', 3, 17)

CREATE TABLE #tblB (
   ID int NOT NULL PRIMARY KEY CLUSTERED,
   GroupID int NOT NULL
)
INSERT #tblB VALUES (14, 1)
INSERT #tblB VALUES (15, 1)
INSERT #tblB VALUES (16, 2)
INSERT #tblB VALUES (17, 3)

CREATE TABLE #seq (
   seq int identity(1,1) NOT NULL,
   ID int NOT NULL,
   GroupID int NOT NULL,
   PRIMARY KEY CLUSTERED (ID)
)

INSERT #seq
SELECT
   A.ID,
   B.GroupID
FROM
   #tblA A
   INNER JOIN #tblB B ON A.tblBID = b.ID
ORDER BY B.GroupID, A.Sequence

UPDATE A
SET A.Sequence = S.seq - X.MinSeq + 1
FROM
   #tblA A
   INNER JOIN #seq S ON A.ID = S.ID
   INNER JOIN (
      SELECT GroupID, MinSeq = Min(seq)
      FROM #seq
      GROUP BY GroupID
   ) X ON S.GroupID = X.GroupID

SELECT * FROM #tblA

DROP TABLE #seq
DROP TABLE #tblB
DROP TABLE #tblA

If I understood you correctly, then ORDER BY B.GroupID, A.Sequence is correct. If not, you can switch A.Sequence to B.ID.

Also, my index on the temp table should be experimented with. For a certain quantity of rows, and also the width and depth characteristics of those rows, clustering on one of the other two columns in the #seq table could be helpful.

Last, there is a possible different data organization possible: leaving GroupID out of the #seq table and joining again. I suspect it would be worse, but am not 100% sure.

Emtucifor 2010-08-21 04:17:35

ansaurus

tags:

views:

answers:

SQL to resequence items by groups

related questions