views:

99

answers:

7

Well this is a simple design question i've wondered about many times and never found a satisfying solution. My example is with php-sql, but this certainly applies to other languages too.

I have a small database table containing only very few entries, and that almost never needs updating. eg this usertype table:

usertype_id (primary key)  | name       | description
---------------------------+------------+-------------------
1                          | 'admin'    | 'Administrator'
2                          | 'reguser'  | 'Registered user'
3                          | 'guest'    | 'Guest'

Now in the php code, i often have to check or compare the type of user i'm dealing with. Since the user types are stored in the database, i can either:

1) Select * from the usertype table at class instantiation, and store it in an array.
Then all the ids are available to the code, and i can do a simple select to get the rows i need. This solution requires an array and a db query every time the class is instantiated.

$query = "SELECT info, foo FROM user WHERE usertype_id = ".$usertypes['admin'];

2) Use the name column to select the correct usertype_id, so we can effectively join with other tables. This is more or less equivalent to 1) but without needing to cache the whole usertype table in the php object:

$query = "SELECT info, foo FROM user JOIN usertype USING (usertype_id) WHERE usertype.name = 'admin' ";

3) Define constants that match the keys in the usertype table:

// As defines
define("USERTYPE_ADMIN",1);
define("USERTYPE_REGUSER",2);

//Or as class constants
const USERTYPE_ADMIN = 1;
const USERTYPE_REGUSER = 2;

And then do a simple select.

$query = "SELECT info, foo FROM user WHERE usertype_id = " . USERTYPE_ADMIN;

This is probably the most resource-efficient solution, but it is bad to maintain, as you have to update both the table and the code if you need to modify something in the usertype table..

4) Scrap the usertype table and only keep the types in the php code. I don't really like this because it lets any value get into the database and get assigned to the type of user. But maybe, all things considered, it isn't so bad and i'm just complicating something that should be simple..

Anyways, to sum it up the solution I like most is #2 because it's coherent and with an index on usertype.name, it can't be that bad. But what i've often ended up using is #3, for efficiency. How would you do it? Any better solutions?

Thanks,

Vincent

(edit: fixed query in #2)

A: 

Why not denormalize the DB table so instead of having usertype_id, you'd have usertype with the string type (admin). Then in PHP you can just do define('USERTYPE_ADMIN', 'admin');. It saves you from having to modify two places if you want to add a user type...

And if you're really worried about any value getting in, you could always make the column an ENUM data type, so it would self manage...

ircmaxell
This will work if the type table is going to be used by only one table (as in the example). But what if multiple tables need to use the same enum type?
Vincent
Well, then you'll need to use the normalized method. And then you're stuck either constantly querying the DB for the values. Either that, or maintaining it in two places (which isn't fun)...
ircmaxell
A: 

For tables that will contain "type" values especially when is expected such table to change over time I tend to use simple approach: Add Varchar column named hid (comes from "human readable id") with unique key. Then I fill it with id meaningful to humans like:

usertype_id (primary key)  | name       | description       | hid (unique key)
---------------------------+------------+-------------------+---------------
1                          | 'admin'    | 'Administrator'   | 'admin'
2                          | 'reguser'  | 'Registered user' | 'user'
3                          | 'guest'    | 'Guest'           | 'guest'

When you need the actual id you will have to do select based on hid column, i.e.

select usertype_id from tablename where hid = "admin"

This is not an efficient approach but it will ensure compatibility of your application among different deployments (i.e. one client may have 1.admin, 2. guest; other client 1.admin, 2. user, etc.). For your case I think #3 is pretty suitable but if you expect to have more than 10 different user roles - try the "hid" approach.

Ogre_BGR
This is what i meant with solution #2. Though i have to admit the name I've chosen for my human readable id column (`name`) isn't very intuitive.
Vincent
Oh, my bad. Got confused by the join on varchar column (not efficient I think). btw will this sql statement produce any result?!One more thing about this approach: it is good idea to limit the values that it can contain with something like:ereg("^[a-z]{1}[a-z0-9_]{0,254}$", $hid)i.e. lowercase, alphanumeric, underscore allowed.It is useful if you will not be the only person that can enter values...
Ogre_BGR
Yes the name column should be a unique index, else it would be pretty inefficient. You're right the query in #2 is wrong :S -- just fixed it, thanks!
Vincent
A: 

Are you using any kind of framework here? Could these values be stored in a single source - a config file - which both creates a list of the objects in PHP and also populates the table when you bootstrap the database? I'm thinking from a Rails perspective, as it's been a while since I've written any PHP. Solution there would probably be fixtures.

micapam
A: 

My vote is for #4. The user types mean nothing outside of the PHP script. If you're checking user types in your code by 'name', that's probably worse than using define() in terms of portability. This approach is extremely simple (no duplication) and lightweight.

Tim
Just to clarify, I am essentially referring to #3 minus the usertype table.
Tim
The user types could mean something outside of the PHP, ie. if you have to do reporting on the users. If that's the case, then you will probably still need a table containing the user types.
wimvds
A: 

Why not to make it just

foreach (getdbarr("SELECT * FROM usertype") as $row)  {
  define($row['name'],$row['id']);
}
Col. Shrapnel
A: 

You shouldn't need a JOIN in every query to fetch the information about types/roles. You can keep your 'user' model and 'role' models separate in the data access objects (DAO) -- especially since there are so few records for user types.

In most cases where I have a limited number of options that I'd otherwise be joining against a large table, I cache them in memcached as an associative array. In the event I need some information about a particular relationship (like a role) I just lazy load it.

$user = DAO_User::get(1); // this pulls a JOIN-less record
$role = $user->getRole(); // lazy-load

The code for $user->getRole() can be something like:

public function getRole() { 
  // This comes from a cache that may be called multiple 
  // times per request with no penalty (i.e. store in a registry)
  $roles = DAO_UserRoles::getAll();

  if(isset($roles[$this->role_id]))
    return $roles[$this->role_id];

  return null; // or: new Model_UserRole();
}

This also works if you want to display a list with 1000 users on it. You can simply render values for that column from a single $roles associative array.

This is a major performance improvement on the SQL end, and it goes a long way to reducing complexity in your code base. If you have several other foreign keys on the user table you can still use this approach to grab the necessary information when you need it. It also means you can have dependable Model_* classes without having to create hybrids for every possible combination of tables you might JOIN -- which is much better than simply getting a result set, iterating it, and freeing it.

Even with more than 100 rows on both sides of your JOIN, you can still use the lazy load approach for infrequent or highly redundant information. With a reasonable caching service in your code, there's no penalty for calling DAO_UserRole::get(1500) multiple times because subsequent calls during the same request shouldn't hit the database twice. In most cases you're only going to be displaying 10-25 rows per page out of 1000s, and lazy loading will save your database engine from having to JOIN all the extraneous rows before you actually need them.

The main reason to do a JOIN is if your WHERE logic requires it, or if you need to ORDER BY data from a foreign key. Treating JOINs as prohibitively expensive is a good habit to be in.

Jeff Standen
A: 

I almost always go for option 3). You could generate the code needed automatically based on what is available in the DB. The only thing you have to remember then is that you have to run the script to update/rewrite that info when you add another role (but if you're using phing or a similar build tool to deploy your apps, just add a build rule for it to your deploy script and it will always be run whenever you deploy your code :p).

wimvds