Background: I have a large 2D array of integers that I need to load into memory in PHP for each Apache request. I want it to take up less memory.

PHP stores ints in PHP_INT_SIZE bytes, which is 4 bytes (32 bits) on most systems. All the integers are less than 2^16, which means they would fit in a short int (e.g. in C). Am I right in thinking that storing the ints as shorts would take up half the RAM?

Ideally I'd like to be able to do:

$s = (short) 1234; // takes up 2 bytes instead of 4

More info:

  • The array takes up about 100 MB of RAM and is generated by including a 30 MB var_export() dump
  • The array is written in a cron process. Only the reading needs to be memory efficient (and quick)
  • The only operations I need to do on the integers are comparing all of them (<, >, ===) and then reading a few of them (similar to the Floyd-Warshall algorithm)
  • Reading each value from a DB is way too slow as there are a few hundred million reads per request

Some crazy ideas:

  • Use pack() / unpack() but that would still store the values as 32 bit ints when they were unpacked
  • Store the values as pixels in an image and use PHP's GD library to read them (would this be slow?)
  • Use shmop_read() and have the Apache processes share the array
  • Memcached might work but I have no experience with it and I guess it would be many times slower than a native PHP array
  • Learn C++ and write a PHP extension
  • Recompile PHP (or HipHop?) to use 2 bytes for ints
  • Use Igbinary (useful, but will have same problem as pack())
+4  A: 

I would not recommend the last approach. :-)

For a quick solution, I would pack two of your integers into one PHP integer like this:

$big = $int1 + ($int2<<16);

And unpack them as:

$int1 = $big & 65535;
$int2 = ($big>>16) & 65535;
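To show how the same shifts and masks could serve random access into a whole list, here is a minimal sketch. The helper names `packPairs` and `getPacked` are made up for illustration, and the values are assumed to lie in 0..65535 as stated in the question.

```php
<?php
// Hypothetical helpers (not from the answer): two 16-bit values per PHP integer.

/** Pack a list of values (each 0..65535) into half as many integers. */
function packPairs(array $values): array
{
    $packed = [];
    $count = count($values);
    for ($i = 0; $i < $count; $i += 2) {
        $low = $values[$i];
        $high = $values[$i + 1] ?? 0; // pad an odd-length list with 0
        $packed[] = $low | ($high << 16);
    }
    return $packed;
}

/** Read the value at logical index $i from the packed list. */
function getPacked(array $packed, int $i): int
{
    $big = $packed[$i >> 1];          // which packed integer holds index $i
    return ($i & 1)
        ? (($big >> 16) & 0xFFFF)     // odd index: upper 16 bits
        : ($big & 0xFFFF);            // even index: lower 16 bits
}

$values = [1234, 65535, 0, 42, 7];
$packed = packPairs($values);
```

Every read costs one extra shift and mask, which is the CPU overhead to weigh against halving the number of array elements.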

Also, BIG thumbs up for using shared memory. This will make your app way faster.

BarsMonster
Looks like a nice solution. The integers need to be accessed randomly so there could be some CPU overhead with this, but if it works then the memory savings should be worth it. I'll give it a go...
dave1010
+2  A: 

This isn't a task PHP has been designed for.

I recommend you write an application that has the data in-memory and does the calculations with it and then interface with it in PHP to get the results.

PHP's integer size is actually 64-bit on most 64-bit Unix-like platforms.
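A rough way to check this on a given machine is to print `PHP_INT_SIZE` and compare the memory taken by a plain array with the same values stored as a binary string. This is only an illustrative measurement; the element count is arbitrary and the exact figures depend on the PHP version and platform.

```php
<?php
// Illustrative measurement only; numbers vary by PHP version and platform.
$n = 100000;

// Typically 8 on 64-bit builds and 4 on 32-bit builds.
echo 'PHP_INT_SIZE: ', PHP_INT_SIZE, " bytes\n";

$before = memory_get_usage();
$array = [];
for ($i = 0; $i < $n; $i++) {
    $array[] = $i & 0xFFFF; // keep every value below 2^16
}
$arrayBytes = memory_get_usage() - $before;

// The same values as a binary string: 2 bytes each, big-endian ("n" format).
$before = memory_get_usage();
$packed = pack('n*', ...$array);
$stringBytes = memory_get_usage() - $before;

printf("array:  %d bytes (%.1f per element)\n", $arrayBytes, $arrayBytes / $n);
printf("string: %d bytes (%.1f per element)\n", $stringBytes, $stringBytes / $n);
```

A PHP array carries per-element overhead beyond the integer itself, so the array figure should come out well above `PHP_INT_SIZE` bytes per element.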

Shared memory is not a very good option because you still have to copy the data into PHP's memory space.

Writing an extension that keeps everything in memory and accesses it directly is possible but not very practical since you'd have to use shared memory (or some other IPC mechanism) anyway, because you typically run several PHP processes.

Artefacto
I'd like to write something that interfaces with PHP, but I wouldn't know where to start. Any pointers?
dave1010
You could use sockets or some form of [message passing](http://en.wikipedia.org/wiki/Message_passing). For Java, you can use the [PHP/Java bridge](http://php-java-bridge.sourceforge.net/).
Artefacto
Thanks. Only a portion of the array would need to be in PHP's memory space at a time, so I think shmop functions *may* work. Sharing the memory would be the ideal solution as this would scale much better.
dave1010
+1 For keeping the data in-memory
Michael Clerx
+1  A: 

I'd generate and store the array in a binary packed format and extract numbers only when you need them:

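// Binary file generated by cron: 2 bytes per value, high byte first.
// 'packed.bin' is a placeholder path.
$buf = file_get_contents('packed.bin');

function elem($n) {
    global $buf;
    // High byte is at offset 2n, low byte at offset 2n + 1.
    return (ord($buf[$n << 1]) << 8) | ord($buf[($n << 1) | 1]);
}

if (elem(2) > elem(10)) {
    // ...
}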
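For the writing side, the cron job could produce that byte layout with `pack()`. The sketch below assumes a rectangular 2D array stored in row-major order; `packGrid` and `cell` are hypothetical names, not part of the answer.

```php
<?php
// Hypothetical writer for the cron job. Layout: row-major order,
// each value stored as 2 bytes, high byte first (same as elem() expects).

/** Serialize a rectangular 2D array of values in 0..65535 to a binary string. */
function packGrid(array $grid): string
{
    $out = '';
    foreach ($grid as $row) {
        $out .= pack('n*', ...$row); // "n" = unsigned 16-bit, big-endian
    }
    return $out;
}

/** Read cell ($row, $col) from the packed string without unpacking the rest. */
function cell(string $buf, int $width, int $row, int $col): int
{
    $offset = ($row * $width + $col) * 2;
    return (ord($buf[$offset]) << 8) | ord($buf[$offset + 1]);
}

$grid = [
    [1, 2, 3],
    [400, 500, 65535],
];
$buf = packGrid($grid);
// In the cron job this string would then be written out, e.g. with
// file_put_contents($path, $buf), where $path is wherever the web processes read from.
```

Packing one row at a time keeps the argument list passed to `pack()` to the width of a row rather than the size of the whole array.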

You can make it fancier by writing a class that implements ArrayAccess, so that you can simply use $myPackedArray[$x] instead of elem($x) in the rest of the code.
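A minimal sketch of such a wrapper might look like the following. The class name `PackedUint16Array` is made up, the wrapper is read-only by choice, and the `mixed` type declarations assume PHP 8.0 or later.

```php
<?php
// A sketch of the ArrayAccess idea; the class name is hypothetical.
// Requires PHP 8.0+ for the "mixed" type declarations.
final class PackedUint16Array implements ArrayAccess, Countable
{
    private string $buf;

    public function __construct(string $buf)
    {
        $this->buf = $buf; // 2 bytes per value, high byte first
    }

    public function count(): int
    {
        return intdiv(strlen($this->buf), 2);
    }

    public function offsetExists(mixed $offset): bool
    {
        return is_int($offset) && $offset >= 0 && $offset < $this->count();
    }

    public function offsetGet(mixed $offset): mixed
    {
        if (!$this->offsetExists($offset)) {
            throw new OutOfRangeException('Index out of range');
        }
        $pos = $offset << 1;
        return (ord($this->buf[$pos]) << 8) | ord($this->buf[$pos + 1]);
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        throw new LogicException('This array is read-only');
    }

    public function offsetUnset(mixed $offset): void
    {
        throw new LogicException('This array is read-only');
    }
}

$packed = new PackedUint16Array(pack('n*', 10, 20, 65535));
var_dump($packed[2] > $packed[0]); // bool(true)
```

Each `$packed[$x]` goes through a method call, so with a few hundred million reads per request it is worth benchmarking this against the plain function before committing to it.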

stereofrog