views:

199

answers:

1

I would like to know what is the memory usage of BitSet in Scala.For example, if I do:

  var bitArray:BitSet=new BitSet(10)
  bitArray.add(0)
  bitArray.add(2)
  bitArray.add(4)
  bitArray.add(6)
  bitArray.add(8)

How does that compare with and array containing the even numbers 0,2,4,6,8?

What about writing a number in binary:

  var bitArray:BitSet=new BitSet(32)
  bitArray.add(5)
  bitArray.add(3)
  bitArray.add(2)
  bitArray.add(1)
  bitArray.add(0)

How does that compare to the number 47?

I'm asking here of memory usage. But as a more open question, if you know, what are the advantages/disadvantages or uses of BitSet (WR to other common data types).

Thanks,

+5  A: 

You can look at the implementation of BitSet in Scala 2.8 here: scala.collection.mutable.BitSet.

It is implemented based on an array of Longs. The size of the array depends only on the highest number stored in it. Divide the highest number stored in it by 64, rounding up, and you have the size of the array. Each element in the array consumes 8 bytes.

That means that dividing by 8 the greatest number stored in it, roughly yields the size in bytes of the BitSet. "Roughly" because of virtual machine memory management overheads, because the pointer to the array also needs some memory and because the array itself has some overhead.

The order of insertion or the actual number of elements stored in the BitSet have no influence on the size of the memory allocated.

For the two examples you give, only one Long-element is required to store the numbers, using 8 bytes of memory, as in both examples the highest number is less than 64.

An array of Ints, storing any five numbers, would consume 5 * 4 bytes = 20 bytes plus overhead. For storing n numbers you need roughly n * 4 bytes.

So you are comparing (highestNumberStored / 8) bytes against (countOfNumbersStored * 4) bytes.

Ruediger Keller