I have a staging table which contains a who series of rows of data which where taken from a data file.
Each row details a change to a row in a remote system, the rows are effectively snapshots of the source row taken after every change. Each row contains meta data timestamps for creation and updates.
I am now trying to build an update table from these data files which contain all of the update. I require a way to remove rows with duplicate keys keeping only the row with the latest "update" timestamp.
I am aware I can use the SSIS "sort" transform to remove duplicates by sorting on the key field and telling it to remove duplicates, but how do I ensure that the row it keeps is the one with the latest time stamp?