Where do you put your dictionary data?

+4 Q:

Let's say I have a set of Countries in my application. I expect this data to change, but not very often. In other words, I do not regard this set as operational data (I would not provide CRUD operations for Country, for example).

That said, I have to store this data somewhere. I see two ways to do that:

Database driven. Create and populate a Country table and provide some sort of DAO to access it (findById()?). Client code will then have to know the Id of a country (which could also be a name or an ISO code). On the application side I will have a Country class.
Application driven. Create an Enum that lists all the Countries known to my system. It will be stored in the DB as well, but the difference is that client code no longer needs a lookup method (findById, findByName, etc.) or hardcoded Ids, names or ISO codes. It references a particular country directly.

I lean towards the second solution for several reasons. How do you do this?

Is it correct to call this 'dictionary data'?

Addendum: One of the main problems here is that if I have a lookup method like findByName("Czechoslovakia"), then after 1992 it will return nothing. I do not know how the client code will react to that (after all, it more or less expects to always get a Country back because, well, it is dictionary data). It gets even worse if I have something like findById(ID_CZ). It will be really hard to find all these dependencies.

If I remove Country.Czechoslovakia from my enum, I will force myself to take care of every dependency on Czechoslovakia.
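
To illustrate the difference described in the addendum, here is a minimal sketch. It assumes Java 9+ and uses a hypothetical CountryLookup helper with a hard-coded name map; neither is part of the original question. The point is only that a name-based lookup can fail at run time, while a reference to a removed enum constant fails at compile time.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical enum: Czechoslovakia has already been removed from it.
enum Country { CZECH_REPUBLIC, SLOVAKIA, USA }

// Hypothetical lookup helper standing in for a DAO-style findByName().
final class CountryLookup {
    private static final Map<String, Country> BY_NAME = Map.of(
            "Czech Republic", Country.CZECH_REPUBLIC,
            "Slovakia", Country.SLOVAKIA,
            "USA", Country.USA);

    private CountryLookup() {
    }

    // Returns an empty Optional when the name is unknown, so the caller
    // has to decide what a missing country means.
    static Optional<Country> findByName(String name) {
        return Optional.ofNullable(BY_NAME.get(name));
    }
}
```

Here CountryLookup.findByName("Czechoslovakia") compiles and returns an empty Optional at run time, whereas any code still referring to a removed constant such as Country.CZECHOSLOVAKIA would stop compiling.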

+1 A:

If it's not going to change very often and you can afford to bring the application down to apply updates, I'd place it in a Java enumeration and write my own methods for findById(), findByName() and so on.

Advantages:

Fast - no DB access for invariant data (or caching requirement);
Simple;
Plays nice with refactoring tools.

Disadvantages:

Need to bring down the application to update.

If you place the data in its own jarfile, updating is as simple as updating the jar and restarting the application.

The hardcoding concern can be made to go away either by consumers storing a value of the enumeration itself, or by referencing the ISO code which is unlikely to change for countries...

If you're worried about keeping this enumeration in sync with the database, write an integration test that checks exactly that and run it regularly (e.g. on your CI machine).
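
A rough sketch of what such an enumeration with its own lookup method could look like, assuming Java 8+. The constant names, the fields and the findByIsoCode() signature are illustrative choices, not something prescribed above.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

enum Country {
    GERMANY("DE", "Germany"),
    UNITED_KINGDOM("GB", "United Kingdom"),
    UNITED_STATES("US", "United States");

    // Built once, after all constants exist; no database access needed.
    private static final Map<String, Country> BY_ISO_CODE =
            Arrays.stream(values())
                  .collect(Collectors.toMap(Country::isoCode, Function.identity()));

    private final String isoCode;
    private final String displayName;

    Country(String isoCode, String displayName) {
        this.isoCode = isoCode;
        this.displayName = displayName;
    }

    String isoCode() {
        return isoCode;
    }

    String displayName() {
        return displayName;
    }

    // Converts an external representation (e.g. from a third-party
    // component) into the enum; empty if the code is unknown.
    static Optional<Country> findByIsoCode(String isoCode) {
        return Optional.ofNullable(BY_ISO_CODE.get(isoCode));
    }
}
```

Code that already holds a Country value never needs the lookup; it only matters at the boundary where countries arrive as strings.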

Dan Vinton 2009-02-25 13:54:03

For me, the point of making it an enum is that I will not need findById() or findByName(). You do not have to call findByName("USA"); you have Country.USA. Ids are also not needed here, the way I see it.

Georgy Bolyuba 2009-02-25 13:56:58

That's true if your object's member field is of that enum type, but sometimes that's not the case - for example, a third party component may represent countries as ISO codes which you want to convert into your enum.

Dan Vinton 2009-02-25 14:03:16

Agreed, but here I have a rare opportunity to "reinvent the wheel", I guess. I am just looking for the best way to do it. Both the client code and the Country implementation are under our control.

Georgy Bolyuba 2009-02-25 14:08:07

+1 A:

Personally, I've always gone for the database approach, mostly because I'm already storing other information in the database so writing another DAO is easy.

But another approach might be to store it in a properties file in the jar? I've never done it that way in Java, but it seems to be common in iPhone development (something I'm currently learning).

Paul Tomblin 2009-02-25 13:54:37

No need to be a properties file - any kind of text file you can parse easily will do.

Jon Skeet 2009-02-25 13:56:13

How efficient are the lookup methods for properties files? Is there an easy way to turn a properties file into a Map? In ObjC, it's two lines of code or so.

Paul Tomblin 2009-02-25 14:47:14

+1 A:

I'd probably have a text file embedded into my jar. I'd load it into memory on start-up (or on first use.) At that point:

It's easy to change (even by someone with no programming knowledge)
It's easy to update even without full redeployment - put just the text file somewhere on the class path
No database access required
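
A minimal sketch of the loading step, assuming Java 9+, a made-up "CODE;Name" line format and a hypothetical CountryFile class (none of these come from the answer itself):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class CountryFile {
    private CountryFile() {
    }

    // Parses lines of the assumed form "CODE;Name".
    // Blank lines and lines starting with '#' are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> namesByCode = new LinkedHashMap<>();
        for (String rawLine : text.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] parts = line.split(";", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            namesByCode.put(parts[0].trim(), parts[1].trim());
        }
        return Collections.unmodifiableMap(namesByCode);
    }

    // Reads the resource from the class path, so a copy placed earlier on
    // the class path can stand in for the one embedded in the jar.
    static Map<String, String> loadFromClasspath(String resourceName) {
        try (InputStream in = CountryFile.class.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found: " + resourceName);
            }
            return parse(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling something like CountryFile.loadFromClasspath("/countries.txt") once at start-up (the resource name is an assumption) gives an in-memory map for all later lookups.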

EDIT: Okay, if you need to refer to the particular country data from code, then either:

Use the enum approach, which will always mean redeployment
Use the above approach, but keep an enum of country IDs and then have a unit test to make sure that each ID is mapped in the text file. That means you could change the rest of the data without redeployment, and a non-technical person can still update the data without seeing scary code everywhere.

Ultimately it's a case of balancing pros and cons - if the advantages above aren't relevant for you (e.g. there'll always be a coder on hand, and deployment isn't an issue) then an enum makes sense.
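
The second option could be checked roughly like this. It is only a sketch: CountryId and CountryDataCheck are hypothetical names, and the map stands for whatever was loaded from the text file, keyed by the same codes as the enum constants.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical enum holding only the IDs that code refers to directly.
enum CountryId { DE, GB, US }

final class CountryDataCheck {
    private CountryDataCheck() {
    }

    // Returns every enum ID that has no entry in the loaded data.
    // A unit test would assert that the result is empty.
    static Set<CountryId> unmapped(Map<String, String> namesByCode) {
        Set<CountryId> missing = EnumSet.noneOf(CountryId.class);
        for (CountryId id : CountryId.values()) {
            if (!namesByCode.containsKey(id.name())) {
                missing.add(id);
            }
        }
        return missing;
    }
}
```

The names and other attributes can then change in the file without a rebuild, while the test catches a file that has lost an ID the code still depends on.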

Jon Skeet 2009-02-25 13:55:47

The central problem here is not the persistence of the data, but rather lookup methods (findByName("USA")) vs. a dependency on a particular instance of the Enum (Country.USA).

Georgy Bolyuba 2009-02-25 14:05:47

+2 A:

In some applications I've worked on there has been a single 'Enum' table in the database that contained all of this type of data. It simply consisted of two columns: EnumName and Value, and would be populated like this:

"Country", "Germany"
"Country", "United Kingdom"
"Country", "United States"
"Fruit", "Apple"
"Fruit", "Banana"
"Fruit", "Orange"

This was then read in and cached at the beginning of the application execution. The advantages being that we weren't using dozens of database tables for each distinct enumeration type; and we didn't have to recompile anything if we needed to alter the data.

This could easily be extended to include extra columns, e.g. to specify a default sort order or alternative IDs.
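
As a sketch of the caching step, assuming Java 8+ and that the rows have already been fetched as (EnumName, Value) pairs; the EnumTableCache class and its method names are made up for illustration, and the database access itself is left out:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EnumTableCache {
    private final Map<String, List<String>> valuesByEnumName = new LinkedHashMap<>();

    // Each row is a two-element array: {EnumName, Value}.
    EnumTableCache(List<String[]> rows) {
        for (String[] row : rows) {
            valuesByEnumName
                    .computeIfAbsent(row[0], name -> new ArrayList<>())
                    .add(row[1]);
        }
    }

    // Values of one enumeration in the order they were read;
    // an empty list if the enumeration name is unknown.
    List<String> valuesOf(String enumName) {
        return Collections.unmodifiableList(
                valuesByEnumName.getOrDefault(enumName, Collections.emptyList()));
    }
}
```

With the rows listed above, valuesOf("Fruit") would yield Apple, Banana and Orange without touching the database again.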

saw-lau 2009-02-25 14:00:44

This solution might be interesting for Enums that change often. The Country enum is likely to stay stable for a long time.

Georgy Bolyuba 2009-02-25 14:34:45

+1 A:

This won't help you, but it depends...

- What are you going to do with those countries?

Will you store them in other tables in the DB? What will happen with existing data if you add new countries? Will other applications access that data?

- Are you going to translate the country names into several languages?

- Will the business logic of your application depend on the chosen country?

- Do you need a Country class?

etc...

Without more information I would start with an Enum with a few countries and refactor depending on my needs...

pgras 2009-02-25 14:06:48

I'd start off doing the easiest thing possible - an enum. When it comes to the point that countries change almost as frequently as my code, then I'd make the table external so that it can be updated without a rebuild. But note when you make it external you add a whole can of UI, testing and documentation worms.

Tom Hawtin - tackline 2009-02-25 14:11:18

I agree with starting from the simple solution, except I think it will be hard to convert client code from one approach to the other.

Georgy Bolyuba 2009-02-25 15:22:26

+1 A:

One of the advantages of using a database table is that you can put foreign key constraints in. That way your referential integrity will always be intact. No need to run integration tests as Dan Vinton suggested for enums; it will never get out of sync.

I also wouldn't try making a general enum table as saw-lau suggested, mainly because you lose clean foreign key constraints, which is the main advantage of having them in the DB in the first place (you might as well stick them in a text file). Databases are good at handling lots of tables. Prefix the table names with "ENUM_" if you want to distinguish them in some fashion.

The app can always load them into a Map at start-up time or when triggered by a reload event.
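
One way that could look, as a sketch only: it assumes Java 10+, and the ReloadableCountries class and its Supplier-based loader are hypothetical stand-ins for whatever DAO actually reads the table.

```java
import java.util.Map;
import java.util.function.Supplier;

final class ReloadableCountries {
    // Hypothetical loader, e.g. a DAO call that reads the country table.
    private final Supplier<Map<String, String>> loader;

    // Swapped as a whole on reload, so readers see either the old
    // snapshot or the new one, never a half-updated map.
    private volatile Map<String, String> namesByCode;

    ReloadableCountries(Supplier<Map<String, String>> loader) {
        this.loader = loader;
        reload(); // initial load at start-up
    }

    // Call this from whatever mechanism signals a reload event.
    void reload() {
        namesByCode = Map.copyOf(loader.get());
    }

    // Returns null if the code is not in the current snapshot.
    String nameOf(String code) {
        return namesByCode.get(code);
    }
}
```

Lookups between reloads never hit the database, and the table stays the single place where a country is added.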

EDIT: From comments, "Of course I will use foreign key constraints in my DB. But it can be done with or without using enums on app side"

Ah, I missed that bit while reading the second bullet point in your question. However I still say it is better to load them into a Map, mainly based on DRY. Otherwise, when whoever has to maintain it comes to add a new country, they're surely going to update in one place but not the other, and be scratching their heads until they figure out that they needed to update it in two different places. A case of premature optimisation. The performance benefit would be minimal, at the cost of less maintainable code, IMHO.

Evan 2009-02-25 14:24:02

Of course I will use foreign key constraints in my DB. But it can be done with or without using enums on the app side.

Georgy Bolyuba 2009-02-25 14:31:29

I've updated my answer.

Evan 2009-02-25 15:08:58

Well, to follow the DRY principle, I would not provide any initialization script for the Country table and would instead have it created on startup (using JPA with Hibernate, for example).

Georgy Bolyuba 2009-02-25 15:19:14

Not familiar with JPA. If it creates it the first time the app starts, then okay I guess. You wouldn't want to be dropping and recreating the table and constraints all the time.

Evan 2009-02-25 15:56:15
