views:

179

answers:

5

Apologies in advance if this is a silly question but I'm wondering which might be faster/better in the following simplified scenario...

I've got registered users (in a users table) and I've got countries (in a countries table) roughly as follows:

USERS TABLE: user_id (PK, INT) | country_id (FK, TINYINT) | other user-related fields...

COUNTRIES TABLE: country_id (PK, TINYINT) | country_name (VARCHAR) | other country-related fields...

Now, every time I need to display a user's country, I need to do a MySQL join. However, I often need to do lots of other joins with regard to the users and the big picture seems quite "join-heavy".

I'm wondering what the pros & cons might be of taking the countries out of the database and sticking them into a class as an array, from which I could easily retrieve them with public method calls using country_id? Would there be a speed advantage/disadvantage?

Thanks a lot.

EDIT: Thanks for the all the views, very useful. I'll pick the first answer as the accepted solution although all contributions are valued.

+1  A: 

It would ease stress on your MySQL server to have less JOIN statements, but not significantly so (there aren't that many countries in the world). However, you'll make up that time in the fact that you'll have to implement the JOIN yourself in PHP. And since you're writing it yourself, you will probably write it less efficiently than the SQL statement, which means that it will take more time. I would recommend keeping it in the SQL server, since the advantages of moving it out are so few (and if the PHP instance and the MySQL instance are on the same box, there are not real advantages).

Travis Gockel
+1  A: 

What you suggest should be faster. Granted, the join probably doesn't cost much, but looking it up in a dictionary should be just about free as far as compute power goes.

This is really just a trade off of memory for speed. The only downsides I could see would of course be the increased memory usage to store the country info and the fact that you would have to invalidate that cache if you ever update the countries table (which is probably not very often).

Eric Petroelje
+2  A: 

Do you have a serious problem performance problem now? I recently went through a performance improvement on a php/mysql website I developed for my company. Certain areas were too slow, and it turned out a lot of fault was with the queries themselves. I used timers to figure out which queries were slow, and I reorganized them (added indexes, etc). In a few cases, it was faster to make two separate queries and join them in php (I had some pretty complicated joins).

Do not try to optimize until you know you have a problem. Figure out if you have a problem first by measuring it, and then if you need to rearrange your queries you will be able to know if you made an improvement.

SorcyCat
Sure.... right now I'm speccing out a new solution that's not yet implemented, but it looks like quite a few other things will be based on it, so I'd rather build it in the right direction not to have to re-engineer fundamentally later.
Tom
Ah, well in that case, my feeling is that it would be faster to have it in mysql. It's unlikely with a list of countries that you'd ever want to do outer joins or anything really weird with the country table in your queries. You'll probably just want to JOIN on the country ID or pull the list of countries from the db. The former makes for pretty fast queries, and the latter can be sped up with caching.
SorcyCat
+1  A: 

I don't think you'd gain anything from removing the join, as you'd have to iterate over all your result rows and manually lookup the country name, which I doubt would be quicker than MySQL can do.

I also would not consider such an approach for the following reason: If you want to change the name of a country (say you've got a typo), you can do so just by updating a row in the database. But if the names of the countries are in your PHP code, you'd have to redeploy the code in order to make a change. I don't know PHP, but that might not be as straightforard than a DB change in a production system.

So for maintainability reasons, IMHO let the DB do the work.

beny23
+1  A: 

The general rule in a database world is to NORMALIZED first (results in more tables) and figure performance issues later.

You will want to DENORMALIZED only for simplicity of code, not for performance. Use indexes and stored procedures. DBMS are designed to optimize on joins.

The reason not "normalize as you go" is that you would have to modify the code you already have written most every time you modify the database design.

Yada