views:

45

answers:

1

I've been trying to estimate the size of an Access table with a certain number of records.

It has 4 Longs (4 bytes each), and a Currency (8 bytes).

In theory: 1 Record = 24 bytes, 500,000 = ~11.5MB

However, the accdb file (even after compacting) increases by almost 30MB (~61 bytes per record). A few extra bytes for padding wouldn't be so bad, but 2.5X seems a bit excessive - even for Microsoft bloat.

What's with the discrepancy? The four longs are compound keys, would that matter?

+2  A: 

This is the result of my tests, all conducted with an A2003 MDB, not with A2007 ACCDB:

      98,304 IndexTestEmpty.mdb
     131,072 IndexTestNoIndexesNoData.mdb
  11,223,040 IndexTestNoIndexes.mdb
  15,425,536 IndexTestPK.mdb
  19,644,416 IndexTestPKIndexes1.mdb
  23,838,720 IndexTestPKIndexes2.mdb
  24,424,448 IndexTestPKCompound.mdb
  28,041,216 IndexTestPKIndexes3.mdb
  28,655,616 IndexTestPKCompoundIndexes1.mdb
  32,849,920 IndexTestPKCompoundIndexes2.mdb
  37,040,128 IndexTestPKCompoundIndexes3.mdb

The names should be pretty self-explanatory, I think. I used an append query with Rnd() to append 524,288 records of fake data, which made the file 11MBs. The indexes I created on the other fields were all non-unique. But if you see the compound 4-column index increased the size from 11MBs (no indexes) to well over 24MBs. A PK on the first column only increased the size only from 11MBs to 15.4MBs (using fake MBs, of course, i.e., like hard drive manufacturers).

Notice how each single-column index added approximately 4MBs to the file size. If you consider that 4 columns with no indexes totalled 11MBs, that seems about right based on my comment above, i.e., that each index should increase the file size by about the amount of data in the field being indexed. I am surprised that the clustered index did this, too -- I thought that the clustered index would use less space, but it doesn't.

For comparison, a non-PK (i.e., non-clustered) unique index on the first column, starting from IndexTestNoIndexes.mdb is exactly the same size as the database with the first column as the PK, so there's no space savings from the clustered index at all. On the off chance that perhaps the ordinal position of the indexed field might make a difference, I also tried a unique index on the second column only, and this came out exactly the same size.

Now, I didn't read your question carefully, and omitted the Currency field, but if I add that to the non-indexed table and the table with the compound index and populate it with random data, I get this:

      98,304 IndexTestEmpty.mdb
     131,072 IndexTestNoIndexesNoData.mdb
  11,223,040 IndexTestNoIndexes.mdb
  15,425,536 IndexTestPK.mdb
  15,425,536 IndexTestIndexUnique2.mdb
  15,425,536 IndexTestIndexUnique1.mdb
  15,482,880 IndexTestNoIndexes+Currency.mdb
  19,644,416 IndexTestPKIndexes1.mdb
  23,838,720 IndexTestPKIndexes2.mdb
  24,424,448 IndexTestPKCompound.mdb
  28,041,216 IndexTestPKIndexes3.mdb
  28,655,616 IndexTestPKCompoundIndexes1.mdb
  28,692,480 IndexTestPKCompound+Currency.mdb
  32,849,920 IndexTestPKCompoundIndexes2.mdb
  37,040,128 IndexTestPKCompoundIndexes3.mdb

The points of comparison are:

  11,223,040 IndexTestNoIndexes.mdb
  15,482,880 IndexTestNoIndexes+Currency.mdb
  24,424,448 IndexTestPKCompound.mdb
  28,692,480 IndexTestPKCompound+Currency.mdb

So, the currency field added another 4.5MBs, and its index added another 4MBs. And if I add non-unique indexes to the 2nd, 3rd and 4th long fields, the database 41,336,832, and increase in size of just under 12MBs (or ~4MBs per additional index).

So, this basically replicates your results, no? And I ended up with the same file sizes, roughly speaking.

The answer to your question is INDEXES, though there is obviously more overhead in the A2007 ACCDB format, since I saw an increase in size of only 20MBs, not 30MBs.

One thing I did notice was that I could implement an index that would make the file larger, then delete the index and compact, and it would return to exactly the same file size as it had before, so you should be able to take a single copy of your database and experiment with what removing the indexes does to your file size.

David-W-Fenton