I have 3 columns in a DataTable:

Id  Name     Count
1   James    4345
2   Kristen  89231
3   James    599
4   Suneel   317113

I need rows 1 and 3 gone (every row whose Name appears more than once), with the new DataTable returning only rows 2 and 4. I found a really good related question in the suggestions on SO, but its solution uses hashtables and only eliminates row 3, not both 1 and 3. Help!

+1  A: 

I tried this: Remove duplicates from a DataTable.

using System.Data;
using System.Linq;
...
//assuming 'ds' is your DataSet
//and that ds has only one DataTable, therefore that table's index is '0'
DataTable dt = ds.Tables[0];
DataView dv = new DataView(dt);
string cols = string.Empty;
foreach (DataColumn col in dt.Columns)
{
    if (!string.IsNullOrEmpty(cols)) cols += ",";
    cols += col.ColumnName;
}
dt = dv.ToTable(true, cols.Split(','));
ds.Tables.RemoveAt(0);
ds.Tables.Add(dt);

The following single line of code returns a new table holding the distinct values of the Name column (the original table is not modified):

ds.Tables["Employee"].DefaultView.ToTable(true,"Name");

ds – Dataset object

dt.DefaultView.ToTable( true, "Name");

dt – DataTable object
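
To make the behaviour of that call concrete, here is a minimal sketch run against the sample data from the question. The class name DistinctDemo and the helper BuildSample are made up for illustration; only DataView.ToTable(bool, params string[]) comes from the answer above. Note that the result keeps one James row rather than dropping both, so on its own it does not produce the output the question asks for.

```csharp
using System;
using System.Data;

static class DistinctDemo
{
    // Hypothetical helper: builds the sample table from the question.
    public static DataTable BuildSample()
    {
        var dt = new DataTable("Employee");
        dt.Columns.Add("Id", typeof(int));
        dt.Columns.Add("Name", typeof(string));
        dt.Columns.Add("Count", typeof(int));
        dt.Rows.Add(1, "James", 4345);
        dt.Rows.Add(2, "Kristen", 89231);
        dt.Rows.Add(3, "James", 599);
        dt.Rows.Add(4, "Suneel", 317113);
        return dt;
    }

    static void Main()
    {
        DataTable dt = BuildSample();

        // distinct = true, restricted to the "Name" column
        DataTable names = dt.DefaultView.ToTable(true, "Name");

        Console.WriteLine(names.Columns.Count); // 1 (only Name is copied)
        Console.WriteLine(names.Rows.Count);    // 3 (James is kept once)
        Console.WriteLine(dt.Rows.Count);       // 4 (source table untouched)
    }
}
```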

Pandiya Chendur
This didn't work, and wasn't what I wanted to do originally. I wanted to make changes to the datatable without involving any more datasets. But thanks for pointing me in the right direction. I looked at the blog, and got what I needed :)
Freakishly
A: 

How about something like this;

Pseudocode: assume each object has 3 properties [Id, Name, Value], and that the collection is called NameObjects and is an IEnumerable (List<NameObject> NameObjects;)

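var _newNameObjectList = new List<NameObject>();
// Names already found more than once, so a third or later copy is not re-added
var _duplicatedNames = new HashSet<string>();

foreach(var nameObject in NameObjects)
{
   if(_duplicatedNames.Contains(nameObject.Name))
   {
      continue;
   }
   // Any() tests the predicate; Select() would only project it to a list of bools
   if(_newNameObjectList.Any(x => x.Name == nameObject.Name))
   {
      _duplicatedNames.Add(nameObject.Name);
      _newNameObjectList.RemoveAll(x => x.Name == nameObject.Name);
   }
   else
   {
      _newNameObjectList.Add(nameObject);
   }
}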

This should work. It requires the System.Linq namespace.

Ryk
Sorry, I didn't want to use LINQ, just pure C#.
Freakishly
A: 

Okay, so I looked at the blog pointed out to me by Pandiya. In the comments section, a chap called Kevin Morris has posted a solution using a C# Dictionary, which worked for me.

In my main block, I wrote:

string keyColumn = "Website";
RemoveDuplicates(table1, keyColumn);

And my RemoveDuplicates function was defined as:

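private void RemoveDuplicates(DataTable table1, string keyColumn)
{
    // Keys seen so far; only the keys matter, the values are unused.
    Dictionary<string, string> uniquenessDict = new Dictionary<string, string>(table1.Rows.Count);
    StringBuilder sb = null;
    int rowIndex = 0;
    DataRow row;
    DataRowCollection rows = table1.Rows;
    // Visit every row, including the last one.
    while (rowIndex < rows.Count)
    {
        row = rows[rowIndex];
        sb = new StringBuilder();
        sb.Append(((string)row[keyColumn]));

        if (uniquenessDict.ContainsKey(sb.ToString()))
        {
            // Removing shifts later rows up, so rowIndex already points at the next row.
            rows.Remove(row);
            // RemoveAllDupes is a bool flag declared elsewhere in the class.
            if (RemoveAllDupes && rowIndex > 0)
            {
                // Assumes the first occurrence is the row immediately before (sorted input).
                row = rows[rowIndex - 1];
                rows.Remove(row);
                rowIndex--;
            }
        }
        else
        {
            uniquenessDict.Add(sb.ToString(), string.Empty);
            rowIndex++;
        }
    }
}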

If you go to the blog, you will find a more generic function that detects dupes over multiple columns. I've added a flag, RemoveAllDupes, for when I want to remove every row that has a duplicate, but this still assumes that the rows are ordered by name, and it handles only duplicates, not triplicates, quadruplicates and so on. If anyone can, please update this code to handle those cases.
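
One way the requested generalization might look is a two-pass approach: first count how often each key value occurs, then remove every row whose key was counted more than once. This is only a sketch, not the blog's function. The names DedupSketch and RemoveAllRowsWithDuplicateKey are made up for illustration, keys are compared as strings via Convert.ToString, and it uses no LINQ. It does not depend on row order and treats triplicates the same as duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class DedupSketch
{
    // Hypothetical helper: removes every row whose keyColumn value occurs more than once.
    public static void RemoveAllRowsWithDuplicateKey(DataTable table, string keyColumn)
    {
        // Pass 1: count how many rows carry each key value.
        var counts = new Dictionary<string, int>();
        foreach (DataRow row in table.Rows)
        {
            string key = Convert.ToString(row[keyColumn]);
            int seen;
            counts.TryGetValue(key, out seen); // seen stays 0 for a new key
            counts[key] = seen + 1;
        }

        // Pass 2: walk backwards so removals never shift rows not yet visited.
        for (int i = table.Rows.Count - 1; i >= 0; i--)
        {
            string key = Convert.ToString(table.Rows[i][keyColumn]);
            if (counts[key] > 1)
            {
                table.Rows.RemoveAt(i);
            }
        }
    }

    static void Main()
    {
        // Sample data from the question.
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Count", typeof(int));
        table.Rows.Add(1, "James", 4345);
        table.Rows.Add(2, "Kristen", 89231);
        table.Rows.Add(3, "James", 599);
        table.Rows.Add(4, "Suneel", 317113);

        RemoveAllRowsWithDuplicateKey(table, "Name");

        foreach (DataRow row in table.Rows)
        {
            Console.WriteLine(row["Id"] + " " + row["Name"]); // rows 2 and 4 remain
        }
    }
}
```

Counting first costs one extra pass over the table, but it means the removal pass already knows the full picture for every key, which is what the single-pass version lacks when the first occurrence has to be removed as well.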

Freakishly