I have a web app built in PHP with Zend on a LAMP stack. I have a list of 4000 words that I need to load into memory. The words have categories and other attributes, and I need to load the whole collection every time. Think of a dictionary object.

What's the best way to store it for quick recall? A flat file with something like XML, JSON or a serialized object? A database record with a big chunk of XML, JSON or a serialized object? Or 4000 records in a database table?

I realize different server configs will make a difference, but assume an out-of-the-box shared hosting plan, or WAMP locally or some other simple setup.

+3  A: 

If you're using APC (or a similar opcode cache), the fastest option is probably to code the word list directly into a PHP source file and then just require_once() it.
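A minimal sketch of that idea, assuming a made-up two-word list and a temporary file path (both are placeholders, not part of the question). The file returns the array, so the sketch uses plain require rather than require_once, since require_once returns true instead of the array if the file was already included in the same request:

```php
<?php
// Hypothetical word list; the real one would hold about 4000 entries.
$words = [
    'apple'  => ['category' => 'fruit',     'length' => 5],
    'carrot' => ['category' => 'vegetable', 'length' => 6],
];

// One-off build step: generate a PHP source file that returns the array.
$path = sys_get_temp_dir() . '/wordlist_example.php';
file_put_contents($path, '<?php return ' . var_export($words, true) . ';');

// On each request: include the file. With an opcode cache enabled,
// the compiled file can be reused instead of being re-parsed.
$loaded = require $path;

echo $loaded['carrot']['category'], PHP_EOL;
```

If you would rather keep require_once, have the file assign the array to a variable or define a constant instead of returning it.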

Alex Howansky
+1  A: 

I'd place it in a file that can be cached, which saves a lot of unnecessary database calls on some (or maybe even every) page load. How you store it doesn't really matter; use whatever works best for you. Speed-wise, 4000 words shouldn't be a problem at all.

For translations in projects I work on, I always use language files containing serialized PHP data, which is easy to retrieve:

$text = unserialize(file_get_contents('/language/en.phpdata'));
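
For completeness, a rough sketch of the full round trip, writing the file as well as reading it. The data, the temporary path, and the error handling are assumptions added for illustration; the allowed_classes option needs PHP 7.0 or later:

```php
<?php
// Hypothetical data and path, for illustration only.
$words = [
    'apple'  => ['category' => 'fruit'],
    'carrot' => ['category' => 'vegetable'],
];
$path = sys_get_temp_dir() . '/words_example.phpdata';

// Build step: write the serialized array once, whenever the list changes.
file_put_contents($path, serialize($words), LOCK_EX);

// Request time: read and decode. allowed_classes => false stops
// unserialize() from instantiating objects found in the file.
$raw = file_get_contents($path);
$loaded = ($raw === false) ? false : unserialize($raw, ['allowed_classes' => false]);
if (!is_array($loaded)) {
    throw new RuntimeException("Could not load word list from $path");
}

echo count($loaded), " words loaded", PHP_EOL;
```
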
Alec
A: 

If you're going to serialise the list of words as XML/JSON anyway, then just use a file. I think a more natural approach is to include the list in the PHP source though.

If that list is going to change, you will have more flexibility with a database.
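
A small sketch of what that flexibility could look like with one row per word. The table layout is hypothetical, and in-memory SQLite via PDO is only a stand-in for whatever database the host provides (it assumes the pdo_sqlite extension is available):

```php
<?php
// In-memory SQLite is only a stand-in so the sketch runs without a server.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema: one row per word.
$db->exec('CREATE TABLE words (word TEXT PRIMARY KEY, category TEXT NOT NULL)');

$insert = $db->prepare('INSERT INTO words (word, category) VALUES (?, ?)');
foreach ([['apple', 'fruit'], ['carrot', 'vegetable']] as $row) {
    $insert->execute($row);
}

// Changing a single entry is one statement, with no file to regenerate.
$db->prepare('UPDATE words SET category = ? WHERE word = ?')
   ->execute(['root vegetable', 'carrot']);

// Loading the whole collection is still a single query.
$words = [];
foreach ($db->query('SELECT word, category FROM words') as $row) {
    $words[$row['word']] = ['category' => $row['category']];
}

echo $words['carrot']['category'], PHP_EOL;
```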

Tim McNamara
A: 

If you need all 4000 in memory all the time, that defeats the purpose of querying a database, although I could be wrong. A serialized object sounds simple enough, and I would expect it to perform fine with that number of words.

Gabriel
+1  A: 

In an ideal system I would rank them, fastest first, as memory (memcached), disk, and database. Depending on the setup, though, the database can sometimes be faster than disk, because the result may stay in the query cache.

It all depends on the environment, and if it's that critical, you should measure it. Otherwise, place it wherever you find it most accessible.
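
One rough way to measure it, sketched with a synthetic 4000-entry list and temporary files (the data, file names, and repeat count are all assumptions). It only compares three file-based formats; a database or memcached loader could be added to the same array of closures. The numbers will vary by machine and by whether an opcode cache is active, so treat them as relative only:

```php
<?php
// Synthetic 4000-entry list standing in for the real data.
$words = [];
for ($i = 0; $i < 4000; $i++) {
    $words["word$i"] = ['category' => 'cat' . ($i % 10), 'rank' => $i];
}

// Write the same data in three formats.
$dir = sys_get_temp_dir();
$files = [
    'php'        => "$dir/bench_words.php",
    'serialized' => "$dir/bench_words.ser",
    'json'       => "$dir/bench_words.json",
];
file_put_contents($files['php'], '<?php return ' . var_export($words, true) . ';');
file_put_contents($files['serialized'], serialize($words));
file_put_contents($files['json'], json_encode($words));

// One loader per format; each returns the full array.
$loaders = [
    'php'        => function () use ($files) { return require $files['php']; },
    'serialized' => function () use ($files) { return unserialize(file_get_contents($files['serialized'])); },
    'json'       => function () use ($files) { return json_decode(file_get_contents($files['json']), true); },
];

// Time each loader over several runs and report the average.
$runs = 20;
$timings = [];
foreach ($loaders as $name => $load) {
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $result = $load();
    }
    $timings[$name] = (microtime(true) - $start) / $runs;
    printf("%-10s %.3f ms per load\n", $name, $timings[$name] * 1000);
}
```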

mhitza
+1  A: 

Format the list as a PHP source file and include it.

Failing that, ask yourself if it really matters how fast this will load. 4000 words isn't all that many.

Mark Ransom
A: 

If you just need to know which one is faster, I'm going with the DB. In addition to being fast, the DB is safer and easier to use. But be careful to use a proper data type, like ntext (MS SQL Server) or BLOB (Oracle).

Orod Semsarzadeh
And this is supported by what evidence?
NullUserException