views:

414

answers:

1

I'm looking for a very compact way of storing a dense variable length bitarray in Java. Right now, I'm using BitSet, but it seems to use on average 1.5*n bits of storage space for a bit vector of size n. Typically, this isn't a problem, but in this case the bitarrays being stored are a pretty significant part the memory footprint of the application. So, it would really help to get them to be a little bit smaller.

The space required by BitSet seems to be due to the fact that the array of longs used to back the data structure tends to double each time it is expanded to hold more bits:

// BitSet's resizing code
private void ensureCapacity(int wordsRequired) {
  if (words.length < wordsRequired) {
    // Allocate larger of doubled size or required size
    int request = Math.max(2 * words.length, wordsRequired);
    words = Arrays.copyOf(words, request);
    sizeIsSticky = false;
  }
}

I could write my own alternative implementation of BitSet that scales the backend data structure more conservatively. But, I'd really hate to duplicate functionality that is already in the standard class libraries if I don't have to.

+7  A: 

If you create the BitSet using the constructor BitSet(int nbits) you can specify the capacity. If you guess the capacity wrong, and go over, it will double the size.

The BitSet class does have a trimToSize method which is private, and is called by writeObject and clone(). If you clone your object, or serialize it, it will trim it to the correct length (assuming the class over expanded it through the ensureCapacity method).

brianegge
Yup. Note you don't actually need to use the copied version. The original is trimmed(!).
Tom Hawtin - tackline
That's pretty clever. Thanks!
dmcer