Let's say you have a DataTable with the columns "id", "cost", and "qty":

DataTable dt = new DataTable();
dt.Columns.Add("id", typeof(int));
dt.Columns.Add("cost", typeof(double));
dt.Columns.Add("qty", typeof(int));

And it's keyed on "id":

dt.PrimaryKey = new DataColumn[1] { dt.Columns["id"] };

Now what we are interested in is the cost per quantity. In other words, if you had a row of:

id | cost  | qty
----------------
42 | 10.00 | 2

The cost per quantity is 5.00.

My question, then: given the preceding table, assume it holds many thousands of rows and you're interested in the three rows with the highest cost per quantity. The information needed is the id and the cost per quantity. You cannot use LINQ.

In SQL it would be trivial; how BEST (most efficiently) would you accomplish it in C# without LINQ?

Update: Seeking answers that do not modify the table.

+1  A: 

I'm not sure if this is best, but it beats sorting and then picking the top three elements, which has time complexity O(n log n).

You can use a priority queue to filter the top three elements. Information about .NET priority queue implementations is available here.

The basic idea is to insert the first three elements of your data table into a min-priority queue ordered by cost per quantity. You then add each of the remaining elements in turn, removing the top (smallest) element after each add. The elements left in the priority queue (heap) at the end are the top three.

No modification to the table is needed: you don't add another column (you just need to define the relative ordering / priority criteria), and the order of the table's rows doesn't change. Time complexity will be O(n log 3) = O(n).
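
For illustration, the approach might look something like the sketch below. It assumes .NET 6 or later, where System.Collections.Generic.PriorityQueue<TElement, TPriority> (a min-heap) is available; on older frameworks you would need your own heap. The class and method names (TopCostPerQty, Top) are made up for this example, and it assumes no DBNull values in the three columns.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class TopCostPerQty
{
    // Returns up to k (id, cost per quantity) pairs, highest ratio first.
    // Only reads the table: no columns are added and no rows are reordered.
    public static List<KeyValuePair<int, double>> Top(DataTable dt, int k)
    {
        // Min-heap keyed on cost per quantity: the root is the weakest of the current top k.
        var heap = new PriorityQueue<int, double>();

        foreach (DataRow row in dt.Rows)
        {
            int qty = (int)row["qty"];
            if (qty == 0) continue; // assumption: skip rows whose ratio is undefined

            double ratio = (double)row["cost"] / qty;
            if (heap.Count < k)
                heap.Enqueue((int)row["id"], ratio);
            else
                heap.EnqueueDequeue((int)row["id"], ratio); // add, then drop the smallest
        }

        // Dequeue yields ascending order, so fill the result from the front backwards.
        var result = new List<KeyValuePair<int, double>>(heap.Count);
        while (heap.TryDequeue(out int id, out double ratio))
            result.Insert(0, new KeyValuePair<int, double>(id, ratio));
        return result;
    }
}
```

Calling TopCostPerQty.Top(dt, 3) makes a single pass over dt.Rows and never holds more than three entries in the heap.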

andand
That's a great idea, and very much in line with what I'm getting at with the question.
byte
+1  A: 

Add a column:

dt.Columns.Add("CostPerQty", typeof(double), "cost / qty");

Then sort:

dt.DefaultView.Sort = "CostPerQty desc";

Then just take the first 3 rows of dt.DefaultView.
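
Put together, that could look roughly like this. It is only a sketch: the class and method names (SortedViewExample, Top3Ids) are invented for the example, and it assumes the "id", "cost", and "qty" columns from the question. Note that it does add a column to the table.

```csharp
using System;
using System.Data;

static class SortedViewExample
{
    // Returns the ids of up to three rows with the highest cost per quantity.
    public static int[] Top3Ids(DataTable dt)
    {
        // Computed column; this changes the table's schema.
        if (!dt.Columns.Contains("CostPerQty"))
            dt.Columns.Add("CostPerQty", typeof(double), "cost / qty");

        // Sorting the view leaves the order of dt.Rows itself untouched.
        DataView view = dt.DefaultView;
        view.Sort = "CostPerQty DESC";

        int count = Math.Min(3, view.Count);
        int[] ids = new int[count];
        for (int i = 0; i < count; i++)
            ids[i] = (int)view[i]["id"];
        return ids;
    }
}
```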

BFree
That would work, but for the sake of curiosity I'm looking for an alternative that doesn't alter the table. Upvoting you and editing the question.
byte
A: 

I like BFree's solution, but, provided you don't want an extra column in your dataset for some reason, is there a reason you can't make a call to your db and execute a stored procedure for the results?

Alternatively, don't use a DataTable, but parse objects from your ADO.NET results and create an IEnumerable<T> (or List, or array, or whatever) of those objects. Then just sort them? I prefer the object solution (even without using Entities, or LINQ to SQL, or such) just because it gives me so much more flexibility in what I can do with a row once I have it...

AllenG
No doubt, but the question is on DataTable, and that only. If this could be done in any data structure, it would defeat the purpose behind the question. (That's kind of why I posted the no LINQ at the bottom, but I really meant it more generally.)
byte