views:

166

answers:

4

Stackoverflowers,

I have a resultset from a SQL query in the form of:

Category  Column2   Column3
A         2        3.50  
A         3        2  
B         3        2  
B         1        5  
...

I need to group the resultset based on the Category column and sum the values for Column2 and Column3. I have to do it in code because I cannot perform the grouping in the SQL query that gets the data due to the complexity of the query (long story). This grouped data will then be displayed in a table.

I have it working for specific set of values in the Category column, but I would like a solution that would handle any possible values that appear in the Category column.

I know there has to be a straightforward, efficient way to do it but I cannot wrap my head around it right now. How would you accomplish it?

EDIT

I have attempted to group the result in SQL using the exact same grouping query suggested by Thomas Levesque and both times our entire RDBMS crashed trying to process the query.

I was under the impression that Linq was not available until .NET 3.5. This is a .NET 2.0 web application so I did not think it was an option. Am I wrong in thinking that?

EDIT

Starting a bounty because I believe this would be a good technique to have in the toolbox to use no matter where the different resultsets are coming from. I believe knowing the most concise way to group any 2 somewhat similar sets of data in code (without .NET LINQ) would be beneficial to more people than just me.

EDIT

Here is the solution I came up with in VB.NET in case anyone needs it. It uses the answer by Paul Williams as the starting point. I am taking the values directly from a datareader.:

Public Class Accumulator
    Public sum1 As Integer
    Public sum2 As Decimal
End Class

If IReader.HasRows Then
    Dim grouping As New Dictionary(Of String, Accumulator)

    Do While IReader.Read
        Dim sum As New Accumulator

        If grouping.ContainsKey(IReader.GetString(0)) Then
            sum = grouping.Item(IReader.GetString(0))
        Else
            sum = New Accumulator
            grouping.Item(IReader.GetString(0)) = sum
        End If

        sum.sum1+= IReader.GetInt32(1)
        sum.sum2 += IReader.GetInt32(2)
    Loop

    For Each key As KeyValuePair(Of String, Accumulator) In grouping
        "DO WHAT YOU NEED TO DO WITH THE VALUES HERE"
    Next
End If
+5  A: 

I cannot perform the grouping in the SQL query that gets the data due to the complexity of the query (long story)

Are you sure about that ? you just need to put a SELECT ... GROUP BY ... statement around your complex query :

SELECT Category, SUM(Column2), SUM(Column3)
FROM ( /* your query here */ )
GROUP BY Category

Anyway, if you really want to do it in code, the easiest way is to use Linq. Assuming that the results are stored in a list of objects :

var groupedByCategory =
                    from r in results
                    group r by r.Category into g
                    select new
                    {
                        Category = g.Key,
                        SumOfColumn2 = g.Sum(x => x.Column2),
                        SumOfColumn3 = g.Sum(x => x.Column3)
                    };

UPDATE

I have attempted to group the result in SQL using the exact same grouping query suggested by Thomas Levesque and both times our entire RDBMS crashed trying to process the query.

Uh... which DBMS are you using ? just to make sure I never use it accidentally ;)

I was under the impression that Linq was not available until .NET 3.5. This is a .NET 2.0 web application so I did not think it was an option. Am I wrong in thinking that?

No, you're correct. Linq comes with .NET 3.5 and is not available in earlier versions.

However, if you you happen to be using VS2008 to build an application targeting .NET 2.0, you might be interested in LinqBridge : it's an alternative implementation of the standard Linq operators, which doesn't depend on .NET 3.5. You just need a C# 3 compiler (which comes with VS2008)

Thomas Levesque
Please see the edits I made to the original question. I was thinking that using Linq would be the suggested method to handle this, but I didn't think it was available in .NET 2.0.
NinjaBomb
See my updated answer
Thomas Levesque
The DBMS is Pervasive, which can usually handle and process whatever I throw at it. I guess the complexity of this one throws it for a loop (or crash in this case). Any suggestions on how to accomplish the grouping that are non Linq related?
NinjaBomb
I was under the impression that you could use LINQ in 2.0 SP1 as long as you have a 3.5 compiler and target the 2.0 framework (under project->properties)
BlueRaja - Danny Pflughoeft
@BlueRaja, yes, but you also need to reference the LinqBridge assembly, because Linq operators are implemented in System.Core, which is not available in 2.0
Thomas Levesque
A: 

The definition of group is "on any change in value" So, have a variable which stores the current and previous category. While looping through your results, store current sum, and reset on any change in category value. at the end of loop, or before the next read, set the prev value to the current value. This way you don't need to know what categories in advance.

Jeremy
+2  A: 

You could create a Dictionary of categories to an object that would hold the sums. For example:

public class Accumulator
{
    public decimal SumColumn2;
    public decimal SumColumn3;
}

Dictionary<string, Accumulator> grouping = new Dictionary<string, Accumulator>;
DataTable dt = ... // this is the ungrouped DataTable
foreach (DataRow dr in dt.Rows)
{
    string category = dr["Category"];
    decimal col2 = dr["Column2"];
    decimal col3 = dr["Column3"];
    Accumulator sum = grouping[category];
    if (sum == null)
    {
        sum = new Accumulator();
        grouping[category] = sum;
    }
    sum.SumColumn2 += col2;
    sum.SumColumn3 += col3;
}

From here, you could work out a more general solution if necessary.

Paul Williams
A: 

Have you tried to sort the DataTable by using the DefaultView? See MSDN.

void SumDataTable() {
    MyDataTable.DefaultView.Sort = "ColumnName ASC";
    foreach( DataRowView drv in MyDataTable.DefaultView ) {
        MyDataRow dr = drv.Row as MyDataRow;    // get your DataRow type
        // results here, search in a list of custom summing classes, 
        // or use a Dictionary style approach
}

Having a sorted view like this hopefully eases some of the debugging pain, and if you then combine it with the Dictionary style answer your problem should be solved.

Byron Ross