views:

332

answers:

4

For my university process I'm simulating a process called random sequential adsorption. One of the things I have to do involves randomly depositing squares (which cannot overlap) onto a lattice until there is no more room left, repeating the process several times in order to find the average 'jamming' coverage %.

Basically I'm performing operations on a large array of integers, of which 3 possible values exist: 0, 1 and 2. The sites marked with '0' are empty, the sites marked with '1' are full. Initially the array is defined like this:

int i, j;
int n = 1000000000;
int array[n][n];

for(j = 0; j < n; j++)
{
    for(i = 0; i < n; i++)
    {
        array[i][j] = 0;
    }
}

Say I want to deposit 5*5 squares randomly on the array (that cannot overlap), so that the squares are represented by '1's. This would be done by choosing the x and y coordinates randomly and then creating a 5*5 square of '1's with the topleft point of the square starting at that point. I would then mark sites near the square as '2's. These represent the sites that are unavailable since depositing a square at those sites would cause it to overlap an existing square. This process would continue until there is no more room left to deposit squares on the array (basically, no more '0's left on the array)

Anyway, to the point. I would like to make this process as efficient as possible, by using bitwise operations. This would be easy if I didn't have to mark sites near the squares. I was wondering whether creating a 2-bit number would be possible, so that I can account for the sites marked with '2'.

Sorry if this sounds really complicated, I just wanted to explain why I want to do this.

+7  A: 

You can't create a datatype that is 2-bits in size since it wouldn't be addressable. What you can do is pack several 2-bit numbers into a larger cell:

   struct Cell {
      a : 2;
      b : 2;
      c : 2;
      d : 2;
   };

This specifies that each of the members a, b, c and d should occupy two bits in memory.

EDIT: This is just an example of how to create 2-bit variables, for the actual problem in question the most efficient implementation would probably be to create an array of int and wrap up the bit fiddling in a couple of set/get methods.

Andreas Brinck
but how will you address them by index?
Andrey
@Andrey You'll have to write some kind of wrapper function around this.
Andreas Brinck
How would I wrap up the bit fiddling?
Eddy
+2  A: 

Instead of a two-bit array you could use two separate 1-bit arrays. One holds filled squares and one holds adjacent squares (or available squares if this is more efficient).

I'm not really sure that this has any benefit though over packing 2-bit fields into words. I'd go for byte arrays unless you are really short of memory.

Dipstick
He has a billion x billion array - even if each is just a byte (8 bit number), that's still... a lot of memory.
Kirk Broadhurst
But cutting it from 8 to 2 bits is only a factor of 4. It MAY mean the difference between "fitting in RAM" and "needing swap", so in border cases, it's probably worth it.
Vatine
+1  A: 

You can compact one dimension of array into sub-integer cells. To convert coordinate (lets say x for example) to position inside byte:

byte cell = array[i][ x / 4 ];
byte mask = 0x0004 << (x % 4);
byte data = (cell & mask) >> (x % 4);

to write data do reverse

Andrey
Seems a little redundant to shift the mask up when you are just going to shift it down again (also, the mask only covers 1 bit, should be '3'). Could do as "data = (array[i][x/4] >> (x "
Grant Peters
ok, agree :) but writing process will look a bit different from reading.
Andrey
+2  A: 

The basic idea

Unfortunately, there is no way to do this in C. You can create arrays of 1 byte, 2 bytes, etc., but you can't create areas of bits.

The best thing you can do, then, is to write a new library for yourself, which makes it look like you're dealing with arrays of 2 bits, but in reality does a lot of hard work. The same way that the string libraries give you functions that work on "strings" (which in C are just arrays), you'll be creating a new library which works on "bit arrays" (which in reality will be arrays of integers, with a few special functions to deal with them as-if they were arrays of bits).

NOTE: If you're new to C, and haven't learned the ideas of "creating a new library/module", or the concept of "abstraction", then I'd recommend learning about them before you continue with this project. Understanding them is IMO more important than optimizing your program to use a little less space.

How to implement this new "library" or module

For your needs, I'd create a new module called "2-bit array", which exports functions for dealing with the 2-bit arrays, as you need them.

It would have a few functions that deal with setting/reading bits, so that you can work with it as if you have an actual array of bits (you'll actually have an array of integers or something, but the module will make it seem like you have an array of bits).

Using this module would like something like this:

// This is just an example of how to use the functions in the twoBitArray library.
twoB my_array = Create2BitArray(size); // This will "create" a twoBitArray and return it.
SetBit(twoB, 5, 1); // Set bit 5 to 1 // 
bit b = GetBit(twoB, 5); // Where bit is typedefed to an int by your module.

What the module will actually do is implement all these functions using regular-old arrays of integers.

For example, the function GetBit(), for GetBit(my_arr, 17), will calculate that it's the 1st bit in the 4th integer of your array (depending on sizeof(int), obviously), and you'd return it by using bitwise operations.

Edan Maor
Sorry I'm a bit of a newb to C. Can you clarify what you mean? Why do we set bit 5 to 1? Which libraries do I use?
Eddy
I recommend writing your own library to deal with the whole idea of "two bit arrays". I updated my answer, hope it helps now.
Edan Maor