ansaurus

Question

Alternative to having an aggregate concatenation function

Answer 1

A:

I recommend adding an incrementing index to the table to use as a primary key. If you do not do so, I think SQL Server does so for you automatically behind the scenes, but I highly recommend explicitly defining the column. That way, you can refer to individual rows by the primary key instead of having to search for both cross-reference values.

Also, watch the clustering on the cross-reference table. Don't cluster on the index primary key if you add one. You probably want to cluster on both of the foreign keys, but you get to choose which foreign key one makes more sense. For example:

ALTER TABLE ReceiptPointIndexLocation 
    ADD CONSTRAINT UK_ReceiptPointIndexLocation_ReceiptPointId_IndexLocation
        UNIQUE CLUSTERED (ReceiptPointId, IndexLocation)

Alternatively, if the combination of (ReceiptPointId, IndexLocation) is unique enough, you can make that the primary key.

Paul Williams 2009-08-14 20:49:01

The only reason to explicitly define an identity column would be if the records were to be used as a foreign key in another table. Setting the ReceiptPointId and IndexLocation columns as the primary key would mean you couldn't have more than one combination in the table - that's not the case due to the use of the dates.

OMG Ponies 2009-08-14 21:04:35

I prefer dealing with single column primary keys and defining unique constraints to build indexes. Maybe that's just me. /shrug

Paul Williams 2009-08-14 21:19:55

Assuming the table is not being referenced as a Foreign Key, you've added extraneous data (the identity column) and constraints (because the pk is a unique key). A clustered key is created by default for pk's - you'd have to define indexes to support the unique keys. Your design is overcomplicated for no discernable benefit.

OMG Ponies 2009-08-14 21:37:14

I agree with you for a pure database design. But there are some small benefits to having an identity column anyway. First, you can see how many rows total have been created over the life of the table. COUNT(*) will show the current number, but if some have been deleted, you don't know how many there were. Second, an identity column shows you the order in which they were created. Third, it gives a natural way to access individual rows should you ever need to walk across them (without using a cursor). Fourth, it makes a perfect hash code should you need one on the client.

Paul Williams 2009-08-14 22:24:13

@JamesMLV, how often do these records or cross reference records change? Do you need to detect changes? If so, a timestamp column might be useful.

Paul Williams 2009-08-14 22:25:13

They won't ever change for past dates, but as months go on, the users may decide to link a ReceiptPoint against different IndexLocations

JamesMLV 2009-08-15 07:09:09

Answer 2

+1 A:

1) The primary key for ReceiptPointIndexLocation should be all four columns. However, this wouldn't stop entries with overlapping dates for the same location and receipt. There's no need for an identity column when you can identify rows with a composite key.

2) SQL:

 SELECT (SELECT t.indexlocation + ','
   FROM RECEIPTPOINTINDEXLOCATION t
  WHERE t.startdate >= @start
    AND t.enddate <= @end
FOR XML PATH(''))

OMG Ponies 2009-08-14 20:50:05

I like the answer to 1. I was thinking that using all four as PK wouldn't work because of the overlapping date issue, but maybe that isn't a problem.Your answer to 2 will get me a list of the points, but I'd have to query individually by date in order to group them all in to delimited strings.

JamesMLV 2009-08-14 21:08:56

Group into delimited strings? Could you update your question to show what you're looking for? Then I'll update my answer appropriately.

OMG Ponies 2009-08-14 21:10:52

Hm, maybe I didn't organize the question well. The part immediately under the block of code is what I'm talking about.

JamesMLV 2009-08-14 21:19:58

Do you want a list like "A, B, C"? If so, I don't think there is a built-in function for that. We had to build our own.

Paul Williams 2009-08-14 21:24:57

Updated SQL - thanks to Remus Rusanu's link.

OMG Ponies 2009-08-14 21:31:41

Nice use of FOR XML there. Much simpler than our own implementation.

Paul Williams 2009-08-14 22:01:14

Watch the trailing comma, by the way. But that's easy to remove.

Paul Williams 2009-08-14 22:06:16

Answer 3

+1 A:

There are several ways to aggegate string in SQL. You should read this article Concatenating Row Values in Transact-SQL. the most effective way is using the FOR XML PATH(', ') trick.

Remus Rusanu 2009-08-14 20:55:57

ansaurus

tags:

views:

answers:

Alternative to having an aggregate concatenation function

related questions