I want to remove duplicates from my DataTable, so I'm using DataTable.AsEnumerable().Distinct(DataRowComparer.Default), but it doesn't do what I need. I think that's because each duplicate row has its own unique primary key value.

How can I do what I need? Do I have to write my own DataRowComparer? I'd rather not, because the default should work.

A: 

Maybe you could use DefaultView.RowFilter to run a query that groups by the columns that define a duplicate and selects the MIN (or MAX) RowId as the row to keep.

Look at this question for more info about the query.
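
As far as I know, RowFilter expressions can't group rows by themselves, so here is a rough sketch of the same idea using LINQ to DataSet instead. The column names (RowId, Name, Age) and the class name DedupByGroup are made up for illustration; substitute your own key column and the columns that define a duplicate.

```csharp
using System.Data;
using System.Linq;

static class DedupByGroup
{
    // Groups rows by the columns that define a duplicate (assumed here to be
    // Name and Age) and keeps the row with the smallest RowId in each group.
    public static DataTable KeepLowestRowId(DataTable table)
    {
        return table.AsEnumerable()
            .GroupBy(r => new
            {
                Name = r.Field<string>("Name"),
                Age = r.Field<int>("Age")
            })
            .Select(g => g.OrderBy(r => r.Field<int>("RowId")).First())
            .CopyToDataTable(); // throws InvalidOperationException if there are no rows
    }
}
```

Use OrderByDescending instead of OrderBy if you would rather keep the row with the MAX RowId.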

Jonathan
+3  A: 

You can use DistinctBy from the MoreLINQ project. Basically you specify a projection from a data row to the columns you're interested in, and it will use that. For example:

var rows = DataTable.AsEnumerable()
                    .DistinctBy(row => new { Name = row["Name"],
                                             Age = row["Age"] });

When you say "the default must work", that's basically not going to happen if you have a normal primary key column. Two rows with different primary key values aren't duplicates of each other, because they differ in that data.
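
To make that concrete, here is a small sketch that counts rows both ways. The Id, Name and Age columns and the class name DistinctDemo are assumptions for the example. The second method is a plain LINQ stand-in for DistinctBy (GroupBy plus First), in case you can't add MoreLINQ to your project.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class DistinctDemo
{
    // Distinct over whole rows: every column, including Id, is compared.
    public static int CountWithDefaultComparer(DataTable table) =>
        table.AsEnumerable().Distinct(DataRowComparer.Default).Count();

    // Stand-in for DistinctBy: keeps the first row of each (Name, Age) group.
    public static List<DataRow> DistinctByNameAndAge(DataTable table) =>
        table.AsEnumerable()
             .GroupBy(row => new { Name = row["Name"], Age = row["Age"] })
             .Select(group => group.First())
             .ToList();
}
```

For a table holding (1, Ann, 30), (2, Ann, 30) and (3, Bob, 25), the first method counts 3 rows because the Id values differ, while the second returns 2 rows.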

Another option would be to project to data rows without that primary key column, and then use the normal Distinct method.
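
A minimal sketch of that second option, assuming you pass in the names of the non-key columns yourself (the class and method names here are hypothetical). DataView.ToTable copies only the named columns, and the default comparer then works because the key column is no longer part of the rows.

```csharp
using System.Data;
using System.Linq;

static class ProjectThenDistinct
{
    public static DataRow[] DistinctWithoutKey(DataTable table, params string[] columns)
    {
        // ToTable(false, ...) copies only the named columns and keeps every row.
        DataTable projected = table.DefaultView.ToTable(false, columns);

        // With the key column gone, rows that match on the remaining
        // columns compare as equal under the default comparer.
        return projected.AsEnumerable()
                        .Distinct(DataRowComparer.Default)
                        .ToArray();
    }
}
```

Passing true as the first argument to ToTable should give you the projection and the duplicate removal in one call. Either way, the resulting rows no longer carry the primary key, so this only fits if you don't need it afterwards.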

Jon Skeet
MoreLINQ is cool. Thanks for the pointer to it!
Justin Grant