I want to convert a table that stores name-value pair data into relational form in SQL Server 2008.

Source table

Strings
ID   Type  String
100  1     John
100  2     Milton
101  1     Johny
101  2     Gaddar

Target required

Customers
ID   FirstName  LastName
100  John       Milton
101  Johny      Gaddar

I am following the strategy below.

First, populate the Customers table with the ID values from the Strings table:

INSERT INTO Customers (ID) SELECT DISTINCT ID FROM Strings

You get the following

Customers
ID  FirstName LastName
100 NULL  NULL
101 NULL  NULL

Then update Customers with the remaining attributes by joining it to Strings on the ID column. This way each row in Customers has two matching rows in Strings.

UPDATE Customers
    SET FirstName = (CASE WHEN S.Type = 1 THEN S.String ELSE FirstName END),
        LastName = (CASE WHEN S.Type = 2 THEN S.String ELSE LastName END)
FROM Customers
    INNER JOIN Strings S ON Customers.ID = S.ID

The intermediate state would look like this:

ID  FirstName LastName ID Type String
100 John  NULL  100 1 John
100 NULL  Milton  100 2 Milton
101 Johny  NULL  101 1 Johny
101 NULL  Gaddar  101 2 Gaddar

But this is not working as expected, because when assigning the values in the SET clause it uses only the committed values rather than the uncommitted ones. Is there any way to make an UPDATE statement use uncommitted values (those changed within the processing time of the same query)?

PS: I am not looking for alternative solutions; I want to make my approach work by telling SQL Server to use uncommitted data for the UPDATE.

+1  A: 

The easiest way to do it would be to split the update into two:

UPDATE Customers
SET FirstName = Strings.String
FROM Customers
INNER JOIN Strings ON Customers.ID=Strings.ID AND Strings.Type = 1

And then:

UPDATE Customers
SET LastName = Strings.String
FROM Customers
INNER JOIN Strings ON Customers.ID=Strings.ID AND Strings.Type = 2

There are probably ways to do it in one query such as a derived table, but unless that's a specific requirement I'd just use this approach.

Evil Trout
This was the method I used earlier. I wanted to combine the updates to save scans on the Strings table, as it has millions of records.
Faiz
A: 

Have a look at this; it should avoid all the steps you had.

DECLARE @Table TABLE(
     ID INT,
     Type INT,
     String VARCHAR(50)
)
INSERT INTO @Table (ID,[Type],String) SELECT 100 ,1   ,'John'
INSERT INTO @Table (ID,[Type],String) SELECT 100 ,2   ,'Milton'
INSERT INTO @Table (ID,[Type],String) SELECT 101 ,1   ,'Johny'
INSERT INTO @Table (ID,[Type],String) SELECT 101 ,2   ,'Gaddar'

SELECT  IDs.ID,
     tName.String NAME,
     tSur.String Surname
FROM    (
      SELECT DISTINCT ID
      FROM @Table
     ) IDs LEFT JOIN
     @Table tName ON IDs.ID = tName.ID AND tName.[Type] = 1  LEFT JOIN
     @Table tSur ON IDs.ID = tSur.ID AND tSur.[Type] = 2

OK, I do not think you will find a solution to what you are looking for. The UPDATE (Transact-SQL) documentation states:

Using UPDATE with the FROM Clause

The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated, that is if the UPDATE statement is not deterministic. For example, in the UPDATE statement in the following script, both rows in Table1 meet the qualifications of the FROM clause in the UPDATE statement; but it is undefined which row from Table1 is used to update the row in Table2.

USE AdventureWorks;
GO
IF OBJECT_ID ('dbo.Table1', 'U') IS NOT NULL
    DROP TABLE dbo.Table1;
GO
IF OBJECT_ID ('dbo.Table2', 'U') IS NOT NULL
    DROP TABLE dbo.Table2;
GO
CREATE TABLE dbo.Table1 
    (ColA int NOT NULL, ColB decimal(10,3) NOT NULL);
GO
CREATE TABLE dbo.Table2 
    (ColA int PRIMARY KEY NOT NULL, ColB decimal(10,3) NOT NULL);
GO
INSERT INTO dbo.Table1 VALUES(1, 10.0), (1, 20.0);
INSERT INTO dbo.Table2 VALUES(1, 0.0);
GO

UPDATE dbo.Table2 
SET dbo.Table2.ColB = dbo.Table2.ColB + dbo.Table1.ColB
FROM dbo.Table2 
    INNER JOIN dbo.Table1 
    ON (dbo.Table2.ColA = dbo.Table1.ColA);
GO
SELECT ColA, ColB 
FROM dbo.Table2;
astander
That's again the same thing as answer 1. Instead of doing separate updates, you are using multiple joins. This will also cause multiple scans on the Strings table, which I want to avoid. And I am not looking for alternative solutions; I want to make my approach work by telling SQL Server to use uncommitted data. Thanks anyway. Appreciate the effort :)
Faiz
Have you had a look at READUNCOMMITTED hint?
astander
I tried setting the isolation level to READ UNCOMMITTED just above the statement (SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;), but it makes no difference. I suspected it would have no effect on changes made within the query itself.
Faiz
And the Microsoft documentation [http://technet.microsoft.com/en-us/library/ms177523.aspx] says "NOLOCK and READUNCOMMITTED are not allowed"
Faiz
A: 

Astander is correct (I am accepting his answer). The update fails not because of a READ UNCOMMITTED issue but because of the multiple rows returned by the JOIN. I have verified this on MSSQL: UPDATE uses only one of the joined rows (in my tests, the first one generated) to update each row of the target table. This is the behavior of MSSQL, Sybase and similar RDBMSs, whereas Oracle does not allow this kind of update and throws an error.
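To make the "multiple rows returned by the JOIN" point concrete, here is a minimal sketch that counts how many joined rows each Customers row receives. It uses Python's built-in sqlite3 module with an in-memory database purely as a runnable stand-in for SQL Server (an assumption for illustration); it only shows the shape of the join, not SQL Server's UPDATE behavior.

```python
import sqlite3

# In-memory SQLite database: a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Strings (ID INTEGER, Type INTEGER, String TEXT);
    CREATE TABLE Customers (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    INSERT INTO Strings VALUES (100, 1, 'John'), (100, 2, 'Milton'),
                               (101, 1, 'Johny'), (101, 2, 'Gaddar');
    INSERT INTO Customers (ID) SELECT DISTINCT ID FROM Strings;
""")

# The same join the UPDATE ... FROM statement in the question relies on.
join_rows = conn.execute("""
    SELECT C.ID, S.Type, S.String
    FROM Customers C
    INNER JOIN Strings S ON C.ID = S.ID
    ORDER BY C.ID, S.Type
""").fetchall()

# Count the candidate source rows for each target row.
rows_per_customer = {}
for customer_id, _type, _string in join_rows:
    rows_per_customer[customer_id] = rows_per_customer.get(customer_id, 0) + 1

print(rows_per_customer)
```

Each customer ID maps to two candidate rows, which is exactly the situation the quoted documentation calls undefined for UPDATE with a FROM clause.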

And again, MSSQL does not support updating a cell with uncommitted data. I don't know the status with other RDBMSs, and I have no idea whether any RDBMS provides isolation level management within a single query.

An alternative solution is to do it in two steps: aggregate to pivot the rows into columns, and then insert. This needs fewer scans than the methods given in the answers above.

INSERT INTO Customers
SELECT 
    ID
    ,MAX(CASE WHEN Type = 1 THEN String ELSE NULL END) AS FirstName
    ,MAX(CASE WHEN Type = 2 THEN String ELSE NULL END) AS LastName
FROM Strings
GROUP BY ID
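The aggregate-and-insert statement above can be tried end to end with the sample data from the question. The following is a minimal sketch, again assuming Python's built-in sqlite3 module as a stand-in for SQL Server; the table and column names come from the question, and the explicit column list on the INSERT is added here for clarity.

```python
import sqlite3

# In-memory SQLite database: a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Strings (ID INTEGER, Type INTEGER, String TEXT);
    CREATE TABLE Customers (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    INSERT INTO Strings VALUES (100, 1, 'John'), (100, 2, 'Milton'),
                               (101, 1, 'Johny'), (101, 2, 'Gaddar');
""")

# One pass over Strings: MAX ignores the NULLs produced by the CASE
# expressions, so each group collapses to the single non-NULL value per Type.
conn.execute("""
    INSERT INTO Customers (ID, FirstName, LastName)
    SELECT ID,
           MAX(CASE WHEN Type = 1 THEN String END),
           MAX(CASE WHEN Type = 2 THEN String END)
    FROM Strings
    GROUP BY ID
""")

customers = conn.execute(
    "SELECT ID, FirstName, LastName FROM Customers ORDER BY ID"
).fetchall()
print(customers)
```

The result matches the target table from the question: one row per ID with FirstName and LastName filled in.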

Thanks to my friend Roji Thomas for helping me with this.

Faiz