I have a data table with 600,000 records, around 25 megabytes in total. It is indexed by a 4-byte key.

Is there a way to find a row in such a dataset quickly with PHP without resorting to MySQL?

The website in question is mostly static, with a little PHP code and no database dependencies, and is therefore fast. I would like to add this data without having to use MySQL if possible.

In C++ I would memory map the file and do a binary search in it. Is there a way to do something similar in PHP?

+1  A: 

I would suggest memcachedb or something similar. If you are going to handle this entirely in PHP, the script will have to read the entire file or data structure on each request. It's not possible to do that dynamically in a reasonable time.

matiasf
This definitely looks nice, however I'm looking for something that comes as part of standard shared linux hosting.
Ghostrider
A MySQL database almost always comes standard with shared Linux hosting.
Zak
If you don't want to install memcached, you can look into the PHP PECL extension APC (http://php.net/manual/en/book.apc.php). It's a local version of Memcached. It does require that you install something, or at least have the libraries available to you, if you're on a shared host.
Matt S
+1  A: 

In C++, would you stop and restart the application each time a user wanted to view the file in a different way, loading and unloading the file every time? Probably not, but that is how PHP differs from a long-running application and from application programming languages.

PHP has tools to help you deal with the environment teardown/buildup. These tools are the database and/or keyed caching utilities like memcache. Use the right tool for the right job.

Zak
On Windows (and probably on Linux too), if you read the file frequently it will stay cached in memory, so mapping the file into memory would be fast, and so would the lookups, even if you restart the app.
Ghostrider
+1  A: 

PHP (at least 5.3) should already be optimized to use mmap when it's available and likely to be advantageous. Therefore, you can use the same strategy you say you would use with C++:

  • Open a stream with fopen
  • Move around for your binary search with fseek and fread
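
As a rough sketch of those two steps, assuming a layout the question doesn't actually specify: fixed-size records sorted by key, each starting with a 4-byte big-endian unsigned key followed by the payload. The function name find_record and the $recordSize parameter are made up for illustration, and the key comparison assumes a 64-bit PHP build, where unpack('N') yields a non-negative integer.

```php
<?php
// Hypothetical layout (an assumption, not from the question): fixed-size
// records sorted by key; bytes 0-3 of each record hold a big-endian
// unsigned key, the remaining bytes hold the payload.
function find_record(string $path, int $key, int $recordSize): ?string
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        return null;
    }

    $lo = 0;
    $hi = intdiv(fstat($fh)['size'], $recordSize) - 1;
    $found = null;

    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);

        // Jump straight to record $mid and read only that record.
        fseek($fh, $mid * $recordSize);
        $record = fread($fh, $recordSize);
        if ($record === false || strlen($record) < $recordSize) {
            break; // truncated or unreadable file
        }

        $midKey = unpack('N', $record)[1];
        if ($midKey === $key) {
            $found = substr($record, 4); // payload without the key
            break;
        }
        if ($midKey < $key) {
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;
        }
    }

    fclose($fh);
    return $found;
}
```

With 600,000 records this takes at most about 20 seek-and-read steps per lookup, and the operating system's file cache should keep the hot parts of the file in memory between requests.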

EDIT: Actually, it seems PHP uses mmap only in some other circumstances, such as file_get_contents. It shouldn't matter, but you can also try file_get_contents.
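
If you go the file_get_contents route, the search runs over a string instead of a stream. A minimal sketch under the same assumed layout (fixed-size records sorted by a 4-byte big-endian key); find_in_blob and $recordSize are hypothetical names, and the offset argument to unpack needs PHP 7.1 or later. Note that this reads the whole file on every request, so it's worth measuring against the fseek/fread version.

```php
<?php
// Binary search over a file already loaded into a string.
// Assumed layout: fixed-size records sorted by key, each starting with a
// 4-byte big-endian unsigned key followed by the payload.
function find_in_blob(string $blob, int $key, int $recordSize): ?string
{
    $lo = 0;
    $hi = intdiv(strlen($blob), $recordSize) - 1;

    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        $offset = $mid * $recordSize;

        // Decode the key of record $mid in place (offset argument: PHP >= 7.1).
        $midKey = unpack('N', $blob, $offset)[1];

        if ($midKey === $key) {
            return substr($blob, $offset + 4, $recordSize - 4);
        }
        if ($midKey < $key) {
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;
        }
    }

    return null; // key not present
}

// Usage, with a hypothetical file name and record size:
// $blob = file_get_contents('records.bin');
// $row  = find_in_blob($blob, 12345, 42);
```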

Artefacto