I have a table that defines a heirarchy:
Create Table [example] (
id Integer Not Null Primary Key,
parentID Integer Null,
largeData1 nVarChar(max) Null,
largeData2 nVarChar(max) Null);
-- largeData3...n also exist
Insert Into [example] (id, parentID, largeData1, largeData2)
Select 1, null, 'blah blah blah', null Union
Select 2, 1, null, null Union
Select 3, 1, 'foo bar foobar', null Union
Select 4, 3, null, 'lorem ipsum' Union
Select 5, 4, null, null;
Hierarchy diagram for this data:
I want to write a query that will return a single row for any given [id] value. The row should contain that row's [id] and [parentID] information. It should also contain the [largeData1...n] fields. However, if a largeData field is null, then it should traverse up the hierarchy until a non-null value for that field is encountered. It should, in short, function like the coalesce function, except across a hierarchy of rows instead of a set of columns.
Example:
Where [id] = 1:
id: 1
parentID: null
largeData1: blah blah blah
largeData2: null
Where [id] = 2
id: 1
parentID: 1
largeData1: blah blah blah
largeData2: null
Where [id] = 3
id: 3
parentID: 1
largeData1: foo bar foobar
largeData2: null
Where [id] = 4
id: 4
parentID: 3
largeData1: foo bar foobar
largeData2: lorem ipsum
Where [id] = 5
id: 5
parentID: 4
largeData1: foo bar foobar
largeData2: lorem ipsum
So far, I have this:
Declare @id Integer; Set @id = 5;
With heirarchy
(id, parentID, largeData1, largeData2, [level])
As (
Select id, parentID, largeData1,
largeData2, 1 As [level]
From example
Where id = @id
Union All
Select parent.id, parent.parentID,
parent.largeData1,
parent.largeData2,
child.[level] + 1 As [level]
From example As parent
Inner Join heirarchy As child
On parent.id = child.parentID)
Select id, parentID,
(Select top 1 largeData1
From heirarchy
Where largeData1 Is Not Null
Order By [level] Asc) As largeData1,
(Select top 1 largeData2
From heirarchy
Where largeData2 Is Not Null
Order By [level] Asc) As largeData2
From example
Where [id] = @id;
This returns the results that I am looking for. However, according to the query plan, it is making a separate pass through the hierarchy for each largeData field that I pull back.
How can I make this more efficient?
This is obviously a simplified version of a more complex problem. The final query will return data in XML format, so any solutions involving the FOR XML clause are perfectly fine.
I can create a CLR aggregate function for this, if doing so would help. I have not yet explored that route.