views:

322

answers:

9

I'm creating an online dictionary and I have to use three different dictionaries for this purpose: everyday terms, chemical terms, computer terms. I have tree options:

1) Create three different tables, one table for each dictionary

2) Create one table with extra columns, i.e.:

id    term    dic_1_definition    dic_2_definition    dic_3_definition
----------------------------------------------------------------------
1     term1   definition
----------------------------------------------------------------------
2     term2                       definition
----------------------------------------------------------------------
3     term3                                           definition
----------------------------------------------------------------------
4     term4                       definition
----------------------------------------------------------------------
5     term5   definition                              definition
----------------------------------------------------------------------
etc.

3) Create one table with an extra "tag" column and tag all my terms depending on it's dictionary, i.e.:

id    term     definition    tag
------------------------------------
1     term1    definition    dic_1
2     term2    definition    dic_2
3     term3    definition    dic_3
4     term4    definition    dic_2
5     term1    definition    dic_2
etc.

A term can be related to one or more dictionaries, but have different definitions, let's say a term in everyday use can differ from the same term in IT field. That's why term1 (in my last) table can be assigned two tags - dic_1 (id 1) and dic_2 (id 5).

In future I'll add more dictionaries, so there probably will be more than three dics. I think if I'll use option 2 (with extra columns) I'll get in future one table and many many columns. I don't know if it's bad for performance or not.

Which option is the best approach in my case? Which one is faster? Why? Any suggestions and other options are greatly appreciated.

Thank you.

+1  A: 

Data Normalization .. I would go with 3, then you don't have to do any fancy queries to identify how many definitions are applicable per a given term

DRapp
+2  A: 

Option 3 sounds like to most appropriate choice for your scenario. It makes the queries a little simpler and is definitely more maintainable in the long run.

Option 2 is definitely not the way to go because you will end up with a lot of null values and writing queries against such a table will be a nightmare.

Option 1 is not bad but before your application could query it has to deceide which table to query against and that could be a problem.

So option 3 would result in simple queries like:

Select term, definition from table where tag = 'dic_1'

You may even create another tag table to keep info about the tags themselves.

Vincent Ramdhanie
Instead of using a tag, he can create a new dictionary table `(id, name)` and use the `id` in the table. Takes less memory and is faster to check and join on.
Peter Stuifzand
+6  A: 

2) Create one table with extra column

You definitely shouldn't be using the 2nd approach. What if in the future you decide that you want 10 dictionaries? You would have to create an additional 10 columns which is madness..

What you should do is create a single table for all your dictionaries, and a single table for all your terms and a single table for all your definitions, that way all your data is grouped together in a logical fashion.

Then you can create a unique ID for each of your dictionaries, which is referenced in the terms table. Then all you need is a simple query to obtain the terms for a particular dictionary.

James
+5  A: 

I think you should have a lookup table for your dictionary types

DictionaryType(DTId, DTName)

Have another Table for you terms

Terms(TermID, TermName)

Then your definitions

Difinitions(DifinitionId, TermID, Definition, DTId)

This should work.

Colour Blend
Whats the DictionaryType, your answer is the best but i dont see how that table is needed at all.
Zyris Development Team
The DictionaryType table contain all dictionary names. He said, "I'm creating an online dictionary and I have to use three different dictionaries"
Colour Blend
What if I have 3 the same terms with different definitions? Will this term have three id's or one id and 3 definitions?
Anthony
@Anthony: The term would have one record in the Terms table and thus one identifier, and three records in the Definitions table, one for each dictionary type.
Guffa
This is the way to go, if you want to be flexible with all three objects in the future. This way you can use ORM easily and add properties to all your three object types independently of the other object types.
tharkun
Makes sense blend, It didn't click earlier it is a M-M relationship.
Zyris Development Team
I disagree! I'll illustrate by way of sample data... **DictionaryType** : {1,'IT Terms'} **Terms** : {1,'DRY'}{2,'DRY'}{3,'AKO'}{4,'I'm an orphan term without a definition, but no-one noticed :('}{5, ''} **Definitions** : {1,1,'Don't Repeat Yourself'}{2,2,'Oops! Apparently its **DO** Repeat Yourself',1}{4,5,'I'm legal! I define Term 5... Now I just wish I knew its name. :S',1}{3,3,'Artificial Key Overkill - what happens when developers stop thinking',1}
Craig Young
Only terms with definitions get in the Definitions table. There a dictionary type and there are terms. Not every term may have a definition at the moment. But, you won't get into the definitions table just because of that. When you have a definition, then we'll link you up.
Colour Blend
Ok, I'll acknowledge that (although it wasn't specified) there _just might so happen to be_ a need to capture a term before the definition is available. (But one should not cater for things unless they're needed; otherwise things could get out-of-hand and you end up too many tables to manage.) This still doesn't resolve the other data anomalies that this design creates through unnecessarily excessive use of artificial keys.
Craig Young
+1  A: 

There's always an "it depends..."

Having said that, option 2 will usually be a bad choice - both from the purist perspective (Data Normalisation) and the practical perspective - you have to alter the table definition to add a new dictionary (or remove an old one)

If your main access is always going to be looking for a matching term, and the dictionary name ('everyday', 'chemical', 'geek') is an attribute, then option 3 makes sense.

If on the other hand your access is always primarily by dictionary type as well as term, and dictionary 1 is huge but rarely used, while dictionaries 2..n are small but commonly used, then option 1 might make more sense (or option 1a => 1 table for rarely used dictionaries, another for heavily used dictionaries)... this is a very hypothetical case !

Steve De Caux
+1 I agree with you. The requirements here a far too vague, resulting in the 'accepted answer' being totally over-'solved'. That said, working off the little provided; I'd go with a variation on #3.
Craig Young
+1  A: 

You want to fetch data based on the dictionary type, that means that the dictionary type is data.

Data should be in the fields of the tables, not as table names or field names. If you don't have the data in the fields, you have a data model that needs changes if the data chances, and you need to create queries dynamically to get the data.

The first option uses the dictionary type as table names.

The second option uses the dictionary type as field names.

The third option correctly places the dictionary type as data in a field.

However, the term and the tag should not be strings, they should rather be foreign keys to tables where the terms and dictionary types are defined.

Guffa
+2  A: 

I have developed similar project and my design was as follows. Storing words, definitons and dictionaries in different tables is a flexible choice especially where you will add new dictionaries in future.

alt text

eyazici
+1 Elegant and to the point.
Damir Sudarevic
+1  A: 

Your database structure should contain data, the structure itself should not be data. This rules out option 2 immediately, unless you create the different tables in order to build separate applications running on the different dictionaries. If they are being shared, then it is the wrong way to do it.

Option 1 requires a database modification and queries to be rewritten in order to accommodate addition of new dictionaries. It also adds excessive complication to simple queries, such as "what dictionaries are this word in?"

Option 3 Is the most flexible and best choice here. If your data grows too large you can eventually use DB side details like table partitioning to speed up things.

Donnie
A: 

The requirements here are far too vague, resulting in the 'accepted answer' being totally over-'solved'. The requirements need to provide more information about how the dictionaries will be used.

That said, working off the little provided; I'd go with a variation on #3.

  • Number 1 is perfectly viable if the dictionaries will be used entirely independently, and the only reason the concept of shared terms was mentioned is that it just happens to be a coincidental possibility.
  • Ditch 2; it unnecessarily leads to NULL values in columns, and DB designs don't like that.
  • Number 3 is the best, but ditch the artificial key, and key on Term + Tag. Apart from the artificial key creating the possibility of duplicate entries (by Term + Tag). If no other tables reference TermDefinitions, the key is a waste; if something does; then they say (for example) "I'm referencing TermDefinition #3... Uhhm, whatever that is. :S"

In a nutshell, nothing provided so far in the requirement indicates any need for anything more complicated than option 3.

Craig Young