views:

37

answers:

3

Hey,

I'm developing a job service that has features like radial search, full-text search, the ability to do full-text search + disable certain job listings (such as un-checking a textbox and no longer returning full-time jobs).

The developer who is working on Sphinx wants the database information to all be stored as intergers with a key (so under the table "Job Type" values might be stored such as 1="part-time" and 2="full-time")... whereas the other developers want to keep the database as strings (so under the table "Job Type" it says "part-time" or "full-time".

Is there a reason to keep the database as ints? Or should strings be fine?

Thanks!

Walker

+3  A: 

Choosing your key can have a dramatic performance impact. Whenever possible, use ints instead of strings. This is called using a "surrogate key", where the key presents a unique and quick way to find the data, rather than the data standing on it's own.

String comparisons are resource intensive, potentially orders of magnitude worse than comparing numbers.

You can drive your UI off off the surrogate key, but show another column (such as job_type). This way, when you hit the database you pass the int in, and avoid looking through to the table to find a row with a matching string.

When it comes to joining tables in the database, they will run much faster if you have int's or another number as your primary keys.

Edit: In the specific case you have mentioned, if you only have two options for what your field may be, and it's unlikely to change, you may want to look into something like a bit field, and you could name it IsFullTime. A bit or boolean field holds a 1 or a 0, and nothing else, and typically isn't related to another field.

Ziplin
So you're recommending we have both the int key AND in the same row a field with the equivalent string (part-time etc.)?
Walker
Yes, so whenever you look in your JobTypes table, you'll see a row like: (ID = 1, Name="part-time") And in your Jobs table (I'm making some assumptions about what you're doing here), you could have something like this: (Id=324, CompanyId=32, JobTypeId=1, Name="Intern"), you may want to look into bit fields or something like that for the specific instance of JobType, note I am updating my answer to reflect this. Sorry for the crapping formatting
Ziplin
+1  A: 

if you are normalizing your structure (i hope you are) then numeric keys will be most efficient.

Randy
+1  A: 

Aside from the usual reasons to use integer primary keys, the use of integers with Sphinx is essential, as the result set returned by a successful Sphinx search is a list of document IDs associated with the matched items. These IDs are then used to extract the relevant data from the database. Sphinx does not return rows from the database directly.

For more details, see the Sphinx manual, especially 3.5. Restrictions on the source data.

Mike
Thanks Mike, I had no idea - I appreciate the insight.
Walker