views:

369

answers:

3

The worksheets have hundreds of rows with account numbers in column A, an account description in column B and totals in column C. I want to copy the rows from all 3 worksheets into a single 4th worksheet but where duplicate account numbers are found, I want there just to be one with the totals aggregated into column C of that row and the extras deleted, like this:

Input from sheets (all the sheets are in one .xls file):

Sheet 1 of workbook

                A                     B                       C
1            abc-123            Project Costs             1,548.33
2            abc-321           Housing Expenses                250
3            abc-567           Helicopter Rides          11,386.91

Sheet 2 of workbook

                A                     B                       C
1            abc-123            Project Costs             1,260.95
2            abc-321           Housing Expenses                125
3            abc-567           Helicopter Rides          59,605.48

Sheet 3 of workbook

                A                     B                       C
1            abc-123            Project Costs             1,785.48
2            abc-321           Housing Expenses                354
3            def-345            Elephant Treats         814,575.31

What I would want the result to be:

                A                     B                       C
1            abc-123            Project Costs             4,642.28
2            abc-321           Housing Expenses                729
3            abc-567           Helicopter Rides          70,992.39
4            def-345            Elephant Treats         814,575.31

Notice: Some of the account numbers don't ever repeat, but some do.

+2  A: 

Really what you want to do is run a macro or whatever that copies all your data from the three sheets onto a new sheet, then runs a pivot table on the result. Pivot tables handle the unique-ification of your data set and the aggregation of data for multiplicities.


You can use the following VB code (type Alt-F11 in Excel to get to the VBA editor, insert a new module, and paste this code into it). This code assumes your spreadsheet has three sheets named Sheet1, Sheet2, and Sheet3 that contain your data, and that the data is contiguous and starts in cell A1 on each sheet. It also presumes your spreadsheet has a sheet named "Pivot Sheet" which is where the data will all get copied into.

Sub CopyDataToPivotSheet()

  Sheets("Pivot Sheet").Select
  Range("A1:IV65536").Select
  Selection.Clear

  Sheets("Sheet1").Select
  Range("A1").Select
  Range(Selection, Selection.End(xlToRight)).Select
  Range(Selection, Selection.End(xlDown)).Select
  Selection.Copy
  Sheets("Pivot Sheet").Select
  Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
      :=False, Transpose:=False

  Sheets("Sheet2").Select
  Range("A1").Select
  Range(Selection, Selection.End(xlDown)).Select
  Range(Selection, Selection.End(xlToRight)).Select
  Application.CutCopyMode = False
  Selection.Copy
  Sheets("Pivot Sheet").Select
  Range("A1").Select
  Selection.End(xlDown).Offset(1, 0).Range("A1").Select
  Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
      :=False, Transpose:=False

  Sheets("Sheet3").Select
  Range("A1").Select
  Range(Selection, Selection.End(xlToRight)).Select
  Range(Selection, Selection.End(xlDown)).Select
  Application.CutCopyMode = False
  Selection.Copy
  Sheets("Pivot Sheet").Select
  Selection.End(xlDown).Select
  Range("A1").Select
  Selection.End(xlDown).Offset(1, 0).Range("A1").Select
  Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
      :=False, Transpose:=False

  Rows("1:1").Select
  Application.CutCopyMode = False
  Selection.Insert Shift:=xlDown
  Range("A1").Select
  ActiveCell.FormulaR1C1 = "AccountNum"
  Range("B1").Select
  ActiveCell.FormulaR1C1 = "Description"
  Range("C1").Select
  ActiveCell.FormulaR1C1 = "Total"

End Sub

This is 95% excel generated code (via Record Macro), but I changed up some stuff to make it more generic. So anyway, you can then assign that Macro to a button in the usual way, or you can assign it to a keyboard shortcut via the Tools => Macro => Macros... Options... dialog.

Anyway, that will get your data aggregated onto the Pivot Sheet sheet with appropriate headings.

Then you can go to Data => PivotTable and PivotChart Report. Hit Next, select the data on the Pivot Sheet (including the headings!), hit Next, choose Layout.

Drag the AccountNumber field (on the right of the wizard) into the area labelled "Row". Drag the Description field to under the Account Number field in the "Row" area. Drag the Total field into the "Data" area, then double click on it in the "Data" area and choose "Sum" so that it aggregates this field. Hit OK and you should get a Pivot Table. You're probably going to want to Hide the sub-totals by right clicking on the sub-total title (i.e. "blah blah Total") and clicking Hide. That result looks basically exactly like what your desired output is.

If you wanted to get fancy, you could conceivably automate that last paragraph, but it's probably not worth it.

Hope this helps!

vicatcu
Oh, perfect. Yea macro is what I meant. I thought it was called VBScript but somebody edited my title to say VBA instead. Is that right? How do I do all that which you mentioned?
orokusaki
adding more detail
vicatcu
also, the VBA code is just 'helpful' above, it's not required, as you could always copy the data into the Pivot Sheet by hand as well.
vicatcu
+1 vote for yours too, thanks so much.
orokusaki
+2  A: 

I think ADO is best for this, you will find some notes here: http://stackoverflow.com/questions/1492048/function-for-detecting-duplicates-in-excel-sheet/1492637#1492637

You can use a suitable SQL string to join and group your records.

For example:

strSQL = "SELECT F1, F2, Sum(F3) FROM (" _
       & "SELECT F1,F2,F3 FROM [Sheet1$] " _
       & "UNION ALL " _
       & "SELECT F1,F2,F3 FROM [Sheet2$] " _
       & "UNION ALL " _
       & "SELECT F1,F2,F3 FROM [Sheet3$] ) " _
       & "GROUP BY F1, F2"
Remou
+3  A: 

Here's one way.

Option Explicit

Sub Test()
    Dim sheetNames: sheetNames = Array("Sheet1", "Sheet2", "Sheet3")
    Dim target As Worksheet: Set target = Worksheets("Sheet4")
    Dim accounts As New Dictionary
    Dim balances As New Dictionary
    Dim source As Range
    Dim row As Range
    Dim id As String
    Dim account As String
    Dim balance As Double
    Dim sheetName: For Each sheetName In sheetNames
        Set source = Worksheets(sheetName).Range("A1").CurrentRegion
        Set source = source.Offset(1, 0).Resize(source.Rows.Count - 1, source.Columns.Count)
        For Each row In source.Rows
            id = row.Cells(1).Value
            account = row.Cells(2).Value
            balance = row.Cells(3).Value
            accounts(id) = account
            If balances.Exists(id) Then
                balances(id) = balances(id) + balance
            Else
                balances(id) = balance
            End If
        Next row
    Next sheetName

    Call target.Range("A2:A65536").EntireRow.Delete

    Dim rowIndex As Long: rowIndex = 1
    Dim key
    For Each key In accounts.Keys
        rowIndex = rowIndex + 1
        target.Cells(rowIndex, 1).Value = key
        target.Cells(rowIndex, 2).Value = accounts(key)
        target.Cells(rowIndex, 3).Value = balances(key)
    Next key
End Sub
  1. Create a new module (VBA editor -> Insert -> Module) and paste the above code into it.

  2. Add a reference to Microsoft Scripting Runtime (VBA editor -> Tools -> References -> Check 'Microsoft Scripting Runtime').

  3. Run it by placing the cursor within the code and pressing F5.

Obviously the sheets will have to be named Sheet1, Sheet2, Sheet3 and Sheet4. It won't paste the column headers into Sheet4 but presumably they are static so you can just set them up yourself beforehand.

AdamRalph
Worked!!!!! Thanks, but it output some of this: 9.09495E-15. I failed to mention that some of the inputs in the "total" section are 0 but they show up in excel as a dash and if I double click the field I can see a 0. All the fields that had that, when coming in to the new file turned into that E-24 stuff.
orokusaki
Oh, I just changed Double to Currency instead and it worked. Thanks so much man. Your code is amazing (you would like Python I think based on your code's simplicity).
orokusaki
Just did 6 vote ups on other questions.
orokusaki
thanks very much for the upvotes. I am indeed a fan of Python. I've been playing with IronPython recently since I specialise in .NET these days. the IronPython tooling is a bit limited right now but I'm keen to see it improve. I'd like to play around with pure Python at some stage too if and when I get the time
AdamRalph