I have a DataTable with the columns Value and Name and 30,000 rows. I need to get the Name from the row that has a given Value, and I need to do it quickly: there are about 40,000 lookups against this table, each with a different Value. How can I do this fast?

I'm interested in a solution in C#, not on the database side.

+3  A: 

Assuming you already have the DataTable, you can do:

    // Select returns an array of matching rows; [0] throws if nothing matches
    DataRow row = MyDataTable.Select("Value = 7")[0];
    string name = (string)row["Name"];

Of course, if you are retrieving the DataTable only for this purpose, it is better to only select the data you need from the database. If this is data you are keeping in memory, then I recommend you go with JMarsch's suggestion of using a Dictionary.
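As a minimal sketch of the `Select` approach above, the following guards against the case where no row matches, since indexing `[0]` on an empty result would throw. The helper name `FindName`, the sample rows, and the assumption that Value is an integer column are illustrative and not from the original question.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class SelectLookupSketch
{
    // Hypothetical helper: returns the Name of the first row whose Value matches, or null.
    public static string FindName(DataTable table, int value)
    {
        // Format the number with the invariant culture so the filter expression parses reliably.
        string filter = "Value = " + value.ToString(CultureInfo.InvariantCulture);
        DataRow[] matches = table.Select(filter);
        return matches.Length > 0 ? (string)matches[0]["Name"] : null;
    }

    // Illustrative sample data (assumed column types: int Value, string Name).
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable();
        table.Columns.Add("Value", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(7, "seven");
        table.Rows.Add(8, "eight");
        return table;
    }

    public static void Main()
    {
        DataTable table = BuildSampleTable();
        Console.WriteLine(FindName(table, 7) ?? "(not found)");
        Console.WriteLine(FindName(table, 99) ?? "(not found)");
    }
}
```

Each call evaluates a filter expression against the table, so for tens of thousands of lookups a structure built once, such as a Dictionary, is likely the better fit.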

RedFilter
+3  A: 

Where is the data in the DataTable coming from? It is almost always much faster to filter on the database side than to retrieve the whole database table into a DataTable and filter there. It can be orders of magnitude faster in the database.

Ben Robinson
It can come from XML, from a database, or from parsed plain text, so unfortunately I cannot do this optimization in the database.
tomaszs
+1  A: 

Do NOT use the DataTable - hit the database with the request. Seriously. You want something with an index, optimized for dealing with what it considers small amounts of data.

TomTom
My data comes from XML, a database, or plain text, so unfortunately I cannot pass this task to the database.
tomaszs
Load from XML into optimized classes. That saves a lot of memory, and in this case memory = performance. When an object turns "dirty", put it into a list of objects to write out.
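A rough sketch of what that could look like, assuming an XML shape of `<rows><row value="7" name="seven" /></rows>`. The `Entry` class, the `Load` helper, and the XML layout are all hypothetical; the real input format is not given in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical lightweight replacement for a DataRow.
public sealed class Entry
{
    public int Value;
    public string Name;
}

public static class XmlLoadSketch
{
    // Assumed XML shape: <rows><row value="7" name="seven" /></rows>
    public static Dictionary<int, Entry> Load(string xml)
    {
        var byValue = new Dictionary<int, Entry>();
        foreach (XElement row in XDocument.Parse(xml).Root.Elements("row"))
        {
            var entry = new Entry
            {
                Value = (int)row.Attribute("value"),
                Name = (string)row.Attribute("name")
            };
            byValue[entry.Value] = entry; // on duplicate values, the last row wins
        }
        return byValue;
    }

    public static void Main()
    {
        string xml = "<rows><row value=\"7\" name=\"seven\" /><row value=\"8\" name=\"eight\" /></rows>";
        Dictionary<int, Entry> byValue = Load(xml);

        // Track modified entries so only those need to be written back later.
        var dirty = new HashSet<Entry>();
        Entry changed = byValue[7];
        changed.Name = "SEVEN";
        dirty.Add(changed);

        Console.WriteLine(byValue[7].Name);
        Console.WriteLine(dirty.Count);
    }
}
```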
TomTom
+3  A: 

You could use a DataView and set its sort order to the key. This causes an in-memory index to be built.

However, if this is just a simple DataTable with only a key and a value, I would recommend that you use a Dictionary<TKey, TValue> or a Hashtable instead. You will get much better performance with a Dictionary. In testing, we've loaded millions of items into a dictionary and gotten subsecond lookups. The performance for lookups is either O(1) or O(log n), I can't remember which, but either way it's very fast.

JMarsch
O(1) for Dictionary, O(log n) for SortedDictionary.
digEmAll
@digEmAll Thanks for clearing that up!
JMarsch
+1  A: 

You have a couple of options for doing client-side searching:

  1. Use a DataView

    DataView view = new DataView(table);
    view.Sort = "Value asc";
    int index = view.Find(value);
    // you now have the index of the row in question, or -1 if it was not found
    
  2. Use a Dictionary. This will be somewhat faster, but requires more work up-front and for maintenance. Assuming that Value is an integer column and Name is a string,

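    // DataRowCollection is not generic, so Cast<DataRow>() (from System.Linq) is needed first
    Dictionary<int, string> lookup = table.Rows.Cast<DataRow>().ToDictionary(
        r => (int)r["Value"], 
        r => (string)r["Name"]);
    string name = lookup[value];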
    

    Or, if there may be values that don't exist,

    string name;
    if(lookup.TryGetValue(value, out name)) ...
    

Both of these options will be fairly fast, though the Dictionary will likely be faster. Its only drawback is that you'll have to keep it in sync with your table as changes take place (assuming that changes can take place).

Obviously it's easier for the database itself to do this filtering, but I'll leave the decision as to whether or not this should be done client-side to you.
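For reference, here is a self-contained sketch that puts both options side by side. It assumes Value is an integer column and Name is a string column; the sample rows and the helper names `FindWithView` and `BuildLookup` are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ClientSideLookupSketch
{
    // Illustrative sample data (assumed column types: int Value, string Name).
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable();
        table.Columns.Add("Value", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(8, "eight");
        table.Rows.Add(7, "seven");
        return table;
    }

    // Option 1: a sorted DataView. Find returns the row's index in the view, or -1.
    public static string FindWithView(DataView view, int value)
    {
        int index = view.Find(value);
        return index >= 0 ? (string)view[index]["Name"] : null;
    }

    // Option 2: a dictionary built once from the table rows.
    public static Dictionary<int, string> BuildLookup(DataTable table)
    {
        // DataRowCollection is non-generic, so Cast<DataRow>() comes before ToDictionary.
        return table.Rows.Cast<DataRow>().ToDictionary(
            r => (int)r["Value"],
            r => (string)r["Name"]);
    }

    public static void Main()
    {
        DataTable table = BuildSampleTable();

        // Create the view once and reuse it for every lookup.
        var view = new DataView(table) { Sort = "Value ASC" };
        Console.WriteLine(FindWithView(view, 8) ?? "(not found)");

        Dictionary<int, string> lookup = BuildLookup(table);
        string name;
        Console.WriteLine(lookup.TryGetValue(42, out name) ? name : "(not found)");
    }
}
```

Note that `ToDictionary` throws if two rows share the same Value, so this sketch assumes the Value column is unique.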

Adam Robinson
+1  A: 

In .NET, DataTables are expensive structures. A simpler and more efficient construct is the Dictionary. You can define it like this:

    // requires: using System.Collections.Generic;
    Dictionary<string, string> nameValueList = new Dictionary<string, string>();

and then you can load it like this:

    nameValueList.Add("name1", "value1");

...assuming that the names are unique; otherwise you will get an ArgumentException.
And finally you can look up values like this:

    string res = nameValueList["name1"];

I think this is one of the fastest implementations, if the number of expected transactions justifies the initial overhead of transforming your data.
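If the keys might repeat, one way to avoid the ArgumentException is to check before adding. The sketch below keys the dictionary by Value, as in the question, and keeps the first Name seen for each Value. The `Build` helper and the sample pairs are hypothetical, and "first one wins" is just one possible policy.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateKeySketch
{
    // Hypothetical helper: builds a Value -> Name lookup, keeping the first Name per Value.
    public static Dictionary<int, string> Build(IEnumerable<KeyValuePair<int, string>> pairs)
    {
        var lookup = new Dictionary<int, string>();
        foreach (KeyValuePair<int, string> pair in pairs)
        {
            // Add would throw ArgumentException on a repeated key, so check first.
            if (!lookup.ContainsKey(pair.Key))
            {
                lookup.Add(pair.Key, pair.Value);
            }
        }
        return lookup;
    }

    public static void Main()
    {
        var pairs = new List<KeyValuePair<int, string>>
        {
            new KeyValuePair<int, string>(7, "seven"),
            new KeyValuePair<int, string>(8, "eight"),
            new KeyValuePair<int, string>(7, "sieben") // duplicate key, ignored
        };

        Dictionary<int, string> lookup = Build(pairs);
        Console.WriteLine(lookup.Count);
        Console.WriteLine(lookup[7]);
    }
}
```

To let the last entry win instead, assign through the indexer (`lookup[pair.Key] = pair.Value;`), which overwrites rather than throws.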

Marwan