I have an application which needs to create a snapshot of record identifiers which can be persisted in the database server. I currently have a table that looks like this:
CREATE TABLE Workset (
Workset_id int,
Sequence int,
Record_id int
)
which I insert into using something like:
INSERT INTO WorkSet (Workset_id, Sequence, Record_id)
SELECT TOP 100
@Workset_id,
ROW_NUMBER() OVER (
ORDER BY -- this may be a complex ordering clause
),
Record_id
FROM SalesRecords
-- WHERE some conditions are met
Later on, I can query for the record identifier of a specific entry in the working set by the identifier and the row number
SELECT SalesRecords.* FROM SalesRecords
JOIN WorkSet ON WorkSet.Record_Id = SalesRecords.Record_id
WHERE Workset_id = @Workset_id AND Sequence = @Sequence
The problem with this is that as the snapshot gets large the amount of data that I have to write into the workset data grows quickly. It wouldn't be uncommon to have a workset in the millions of records and if each of those items requires 12 bytes of row storage it adds up quickly.
It seems that a more compact way to represent the data would be to store simply a Workset_id and a varbinary column containing all of the record identifiers in sequence order. This would avoid the overhead of repeating the workset identifier for every row in the same workset, and avoide the need to store the sequence numbers (because they are implied by the position in the varbinary).
Is there any reasonable way that I can transform my INSERT query into something that generates a varbinary of the record identifiers in order?