views:

44

answers:

2

I am working on a certain application. Its basic structure is as follows:

  • a database
  • a web interface to control the whole system
  • a number of client applications on remote machines

The clients poll the database at regular intervals, to check if there are any tasks to perform. A task is:

  1. A formal name of the task (a string), for example "run-regression-test"
  2. A number of arguments. For simplicity, let's assume, that those are strings. There is no need to support more, than, say, ten arguments.

If the client retrieves a task, it parses the name and the arguments and performs certain actions.

Since all of the state information is stored in the database, the tasks must be stored there as well. What I really don't know, is how to implement this as cleanly as possible. Every solution I can come up with seems almost like an anti-pattern. One could store it in a single table like this:

create table tasks (
  id int not null auto increment primary key,
  name varchar(255),
  arg1 varchar(255),
  arg2 varchar(255),
  ...
)

I also thought of doing it in using two tables:

create table tasks (
  id int not null auto_increment primary key,
  name varchar(255)
)

create table task_args (
  id int not null auto_increment primary key,
  task_id int,
  arg varchar(255)
)

This is a bit more flexible and easy to implement as well. However, it is impossible to tell which argument out of a number of argumens is first, which is second, etc (can be fixed by adding a "sequence" column to the task_args table).

I am sure there are better and cleaner ways to do what I want. Perhaps someone with better skills in app-design can give me some inspiration.

Thanks in advance.

+2  A: 

Based on the auto_increment column, I guess your target database is MySQL. In PostgreSQL I'd probably use an array of varchars for this. It's more practical for things like this where the values in the list are not foreign keys or you don't need to query them. In MySQL you can't do that, and using argX columns or serializing the array yourself is just wrong, so I'd use this:

create table tasks (
    id int not null auto_increment primary key,
    name varchar(255)
);

create table task_args (
   task_id int not null,
   no int not null,
   arg varchar(255),
   primary key (task_id, no)
);

To get all arguments for a task you would run SELECT arg FROM task_args WHERE task_id=? ORDER BY no.

Lukáš Lalinský
Really, what a shame. Why did I have to make a mess out of my second model (no need for a separate pk id column in the args table)? I'll never know. Yes, the DBMS is, indeed, mysql.
shylent
You do need a separate PK if the rows are not ordered like in your version. My version will have rows such as (1, 1, 'arg1'), (1, 2, 'arg2'), etc. The `task_id` and `no` columns always have to be unique, so you can just as well use them as the primary key (you don't need an additional index on `task_id`, which you would need otherwise).
Lukáš Lalinský
+1  A: 

I would say that it really depends on how the client is going to parse the information and if you would prefer to make the clients more simple.

For instance, if you want the clients to be simple, I would suggest your second design and add the sequence column. Then, you client doesn't have to attempt to parse the data, it already knows that it is looking at an integer (or can ask simply enough).

However, if the clients are already fairly smart, or they are written in a language that makes it difficult to interface with the database (C perhaps) then I might just store the entire "command task" as a string and have the clients parse it as tokens.

There are obvious downsides to both depending on the language, but I think your ultimate decision boils down to where you want to put the majority of the logic.

Topher Fangio
Good point. The clients are going to be written in perl or python, I think. At least I am sure, that interacting with the database will not be a problem, so no need to store the tasks as tokenized strings.
shylent