views:

614

answers:

2

I'm encountering a very strange problem concerning what appears to be a corrupt index of some kind. Not corrupt in the sense that dbcc checkdb will pick it up, but corrupt in the sense that it has rows that it shouldn't have.

I have two tables, TableA and TableB. For the purposes of my application, some rows are considered functionally duplicate, meaning while not all the column values are the same, the row is treated as a dup by my app. To filter these out, I created a view, called vTableAUnique. The view is defined as follows:

SELECT a.*   
FROM TableA a   
INNER JOIN  
(   
      SELECT ID, ROW_NUMBER() OVER   
          (PARTITION By Col1   
           ORDER BY Col1) AS Num   
      FROM TableA   
) numbered ON numbered.ID = a.ID   
WHERE numbered.Num = 1

The results of the view is all the records from TableA that don't have any other rows in TableA with the same values for Col1. For this example, let's say that TableA has 10 total rows, but only 7 with distinct values that show up in vTableAUnique.

TableB is basically just a list of values that match the values of Col1 from TableA. In this case, let's say that TableB has all 8 unique values that appear in vTableAUnique. So the data from TableA, TableB, and vTableAUnique would look like:

TableA (ID, Col1, Col2, Col3)

1,A,X,X

2,A,X,X

3,B,X,X

4,A,X,X

5,E,X,X

6,F,X,X

7,G,X,X

8,H,X,X

9,I,X,X

10,J,X,X


TableB (ID)

A

B

C

D

E

F

G

H

I

J


vTableAUnique (ID, Col1, Col2, Col3)

1,A,X,X

3,B,X,X

5,E,X,X

6,F,X,X

7,G,X,X

8,H,X,X

9,I,X,X

10,J,X,X

So here is the strange part. Sometimes when I join vTableAUnique with TableB on Col1, I get back the non-distinct values from TableA. In other words, rows that do NOT exist in vTableAUnique, but that do exist in TableA, appear when I do the join. If I do the select just off vTableAUnique, I don't get these rows. In this case, I would get back not just rows with the ids of 1,3,5,6,7,8,9,10, but ALSO rows with the ids of 2 and 4!

After banging my head against my desk, I decided to try and rebuild all the indexes in the DB. Sure enough, the problem disappeared. The same query now returned the correct rows. After an indererminant period of time, however, the problem comes back. DBCC CHECKDB doesn't show any issues, and I'm having a hard time tracking down which index might be causing this.

I'm using SQL Server 2008 Developer Edition on Vista x64.

HELP!

A: 

script out the indexes and look at the script, was it created with ALLOW_DUP_ROW? if so then that could be your problem

SQLMenace
+2  A: 
ROW_NUMBER() OVER (PARTITION By Col1 ORDER BY Col1)

is not a stable sort order, it can change from query to query depending on access path.

Your view may return different results being run several times.

Rebuilding indexes seems to affect the sort order.

Use this:

ROW_NUMBER() OVER (PARTITION By Col1 ORDER BY Id)

instead, it guarantees stable sort order.

Quassnoi
+1, Exactly what I was just typing ;)
LukeH