I have a specific case in mind, but the question applies in general too. How do you deal with data in Excel when the amount of data is arbitrary?

In my specific case, I have a program which generates between 1 and 10 sets of data, each set consisting of 5 arbitrarily (but equally) long arrays (or you could consider it a table with 5 columns). I would like to be able to dump this data into Excel, apply named ranges to it (this much, I already have done), and then manipulate it in Excel to create a report. Ideally, I would like to do this with as little VBA as possible (none would be best). The idea is that the end users of these reports should be able to change the format (or generate a whole new report of the same data) without me having to change my program.

Basically, the reports should be something like 1-10 tables, with one row for each element in the 5 arrays (a column for each array).

Hopefully this example makes it clear what I'm asking. What options does Excel give you for dealing with data in such arbitrary quantities, aside from coding the whole darn thing in VBA?

+1  A: 

You could put all of the data from all of the data sets into a single large range on a single worksheet. You could also add a column identifying which dataset a particular row came from.

For reporting, if you need to summarise some aspects of the data then a Pivot Table (on the Data menu) is the logical choice. You can use the dataset column as a page field in the pivot table so that each dataset can be shown on a separate report.

If you need to filter the datasets before generating reports, you can also use Advanced Filter (also on the Data menu) to copy the matching records to a new sheet. You could then generate Pivot Tables from that new sheet.

There are some risk factors:

  • the size of the datasets may exceed the maximum number of rows on a sheet (1,048,576 in Excel 2007; 65,536 in Excel 2003)
  • the number of different values in a given field may exceed the number which can be summarised in a Pivot Table (1,048,576 in Excel 2007; 32,500 in Excel 2003)
  • users may find Advanced Filter and/or Pivot Table to be overly difficult to use

If your data is likely to exceed any of these maximum sizes then Excel is unlikely to be a suitable program to use for that data (except as a front-end to a database of some kind).

barrowc
I can basically fit all the data on a single printed sheet in 99.99% of cases, so size is not really an issue. The thing is, I don't need any kind of summary data. I just need the data to be in a nicer form. It's just a report.
Daniel Straight
+1  A: 

I'm not sure I understand "and then manipulate it in Excel to create a report" precisely as I can imagine a plethora of disparate ways to generate reports in Excel, but I have a suggestion that might hit the mark for you (I'd have left a comment with clarifying questions, but alas my rep is still too low!). If you want clarifications / example code, let me know and I'd be happy to oblige.

Since you already have named ranges, you could use a QueryTable/ListObject that points back to another worksheet in the same workbook. This uses the Excel Files DSN (through ODBC, and it should be on all workstations with Excel), and I've used this trick a number of times to deal with variable data that I needed to report on. How you do this depends on the version of Excel you're running. I happen to be running 2007 right now, so the directions for that are "Data Tab / From Other Sources / From Microsoft Query": choose "Excel Files", browse out to your file (you may have to save first, as this is the same one you're in) and choose the named range you're after. If you're familiar with MS Query, it should be straightforward from there.

Now, the DSN for the new QueryTable will have a hard-coded reference to the workbook in the connection string. To update this you will need a tiny bit of VBA code in the Workbook_Open event. It should look something like:

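Private Sub Workbook_Open()
   ' Goes in the ThisWorkbook module; runs each time the workbook is opened
   Dim MyLocation As String
   ' Full path of this workbook, wherever it currently lives
   MyLocation = ThisWorkbook.Path & "\" & ThisWorkbook.Name
   ' Re-point the query at the workbook's current location
   Sheet1.ListObjects(1).QueryTable.Connection = _
         "ODBC;DSN=Excel Files;DBQ=" & _
         MyLocation & ";DriverId=1046;MaxBufferSize=2048;PageTimeout=5;"
End Sub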

Obviously with multiple queries you'd need to do some For Eaching, but that's the gist. Let me know if this is what you're looking for, and I can give you a more in-depth explanation of the benefits, approach, code samples to create queries (instead of hand making them through MSQuery), different directions for other versions, etc.
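As a rough idea of what the "For Eaching" could look like, here is a minimal sketch that walks every table on every sheet and re-points the query-backed ones. The helper name UpdateQueryConnections is made up for illustration, and it assumes every query-backed table in the workbook should point back at this same workbook (tables created through MS Query should report a SourceType of xlSrcQuery). You would call it from Workbook_Open in place of the single-table line.

```vba
' Hypothetical helper (name is illustrative): re-points every
' query-backed table in this workbook at the workbook's current path.
Sub UpdateQueryConnections()
    Dim ws As Worksheet
    Dim lo As ListObject
    Dim connStr As String

    ' Same connection string as in the single-table version,
    ' built once from this workbook's full path and file name.
    connStr = "ODBC;DSN=Excel Files;DBQ=" & ThisWorkbook.FullName & _
              ";DriverId=1046;MaxBufferSize=2048;PageTimeout=5;"

    For Each ws In ThisWorkbook.Worksheets
        For Each lo In ws.ListObjects
            ' Plain tables have no QueryTable, so skip anything
            ' that is not backed by a query.
            If lo.SourceType = xlSrcQuery Then
                lo.QueryTable.Connection = connStr
            End If
        Next lo
    Next ws
End Sub
```

This only changes the connection strings; the tables still need a refresh (manually or via the table's refresh-on-open setting) to pick up new data.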

Note: This approach would also work with a PivotTable, but I find them cumbersome for simple data sets like the data you are describing.

TimS
There can be up to about 5 dozen named ranges, and unless I'm missing something, trying to add them all just results in "too many fields."
Daniel Straight
Oh! It wasn't apparent to me that your values were in horizontal arrays. Excel doesn't deal with variable-width named ranges as well as it does data in columns. Does each of the rows represent some type of data/entity? You mention that each dataset will have the same number of columns; do these columns 'line up' (i.e., are they the same type of data)? If you can reorganize it into a more tabular format, you'll have an easier time of it in XL. Depending on your format, maybe an INDEX function would work? =INDEX(rngFirstArray, 1, 1) gets you the first value in your array.
TimS
The Index function works fine with the horizontal arrays. The problem is that if I'm writing INDEX(array, 1, 1) or just INDEX(array, 1) for horizontal, then I have to know when to stop. Do I write up to INDEX(array, 10) or INDEX(array, 15)? The named ranges are applied to the whole row, but they could just as easily be applied to just the elements being used. They're applied by a .NET program that creates the sheet. I can reorganize the data into columns, sure, I just don't see how that helps if I still have to know how many positions to reference with the Index function.
Daniel Straight
Again, I don't know whether this is appropriate for your data, or even what your report layout looks like, but if you have a data set in tabular format with each named range as a column (formerly a row), and a named range applied to the whole data set, then you can query that new named range (as in my post). This would remove any worries about arbitrary row counts. Another alternative, I suppose, could be expanding/contracting the INDEX ranges based on the known number of rows. You should be able to do this straight from .NET, but it could get overly complex depending on the report layout.
TimS
The report layout is arbitrary. That's the whole point of this question. I want the end-users of the report to be able to change it on a whim without having to do any coding.
Daniel Straight
I think I understand, but the way Excel 'best' deals with shifting values is through QueryTables and/or ListObjects. So if you're looking for a 'best practice', you'll need to organize your data in a way that is conducive to using queries, or even better, put it into a table (ListObject) to begin with.
TimS
Bah! Still can't add a comment. Good work on finding a solution to your issue! If you've resigned yourself to writing a macro (to hide the rows that are blank), you could go the next step of auto filling the range up and down depending on the number of rows needed. Great work and great find!
TimS
A: 

This could potentially be useful:

http://www.exceluser.com/solutions/variablelists_xl12.htm

I ended up going with this technique. I just copied down my formulas further than I could possibly ever need and wrote a little VBA function to hide the rows that weren't being used. It's not the most elegant solution, but it gets the job done.
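For anyone wanting to try the same thing, the row-hiding part could be sketched roughly as below. This is not my exact code, just a minimal illustration: the sheet name "Report", the row range 2 to 500, and the use of column A as the "is this row in use" check are all placeholders, and it assumes the copied-down formulas return "" (not an error value) when there is no data for that row.

```vba
' Illustrative sketch only; sheet name, row range and key column
' are assumptions to adapt to the real report layout.
Sub HideUnusedRows()
    Dim ws As Worksheet
    Dim r As Long

    Set ws = ThisWorkbook.Worksheets("Report")

    Application.ScreenUpdating = False
    ' Formulas are assumed to be copied down through row 500 and to
    ' return "" in column A when the row has no data behind it.
    For r = 2 To 500
        ' Hide empty rows, and unhide rows that have data again.
        ws.Rows(r).Hidden = (CStr(ws.Cells(r, 1).Value) = "")
    Next r
    Application.ScreenUpdating = True
End Sub
```

Setting Hidden on every row, rather than only hiding, means rerunning the macro after the data grows will show the newly used rows again.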

Daniel Straight
I don't really like to do it, but since I did end up going with the technique I found, I have to mark my answer as the answer.
Daniel Straight