views:

34

answers:

3

Following up on my previous question, where I asked for database suggestions: it just occurred to me that I don't even know whether what I'm trying to store is appropriate for a database, or whether some other data storage method should be used.

I have test data from physical models (say, wind tunnel data, or something similar), where for every model (M-1234) I have:

name (M-1234)  
length L  
breadth B  
height H  
L/B ratio  
L/H ratio  
...  
lot of other ratios and dimensions ...
force versus speed curve given in the form of a lot of points for x-y plotting  
...  
few other similar curves (all of them of type x-y).

What I'm trying to accomplish is to store this in some reasonable way, so that a user of the database can ask for, say, the ten models closest to L/B=2.5 (or some similar request), and then get all the data for those models, including the curve data, in plain text file format.

Is an SQL database (or any other kind, for that matter) an appropriate way of handling something like this? Or should I take some other approach?

I have about a month to finish this, and in that time I also have to learn enough about databases, so please bear that in mind when giving suggestions. Assume no previous knowledge of the subject whatsoever.

+2  A: 

I think what you're looking for is possible. I'm using PostgreSQL here, but any database should work. This is my test database:

CREATE TABLE test (
    id serial PRIMARY KEY,
    ratio double precision
);
-- ids 1 to 4 are assigned automatically by the serial column
INSERT INTO test (ratio) VALUES (0.3), (0.4), (0.6), (0.7);

Then, to find the nearest values to a particular ratio:

SELECT id, ratio, abs(ratio - 0.5) AS score FROM test ORDER BY score ASC LIMIT 2;

In this case, I'm looking for the two values nearest to 0.5.

I'd probably use a data model with one table for the main data (the ratios and so on) and a second table that holds the curve points, since I'm assuming the curves don't always have the same number of points.
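A minimal sketch of that two-table layout, assuming Python's built-in sqlite3 module in place of PostgreSQL so that it runs without a server. The table names (model, curve_point), the column names, the helper functions and the sample values are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless asked to.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE model (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE,
    length   REAL,
    breadth  REAL,
    height   REAL,
    lb_ratio REAL
);
CREATE TABLE curve_point (
    model_id   INTEGER NOT NULL REFERENCES model(id),
    curve_name TEXT NOT NULL,
    x          REAL NOT NULL,
    y          REAL NOT NULL
);
""")


def add_model(conn, name, length, breadth, height):
    """Insert one model row and return its generated id."""
    cur = conn.execute(
        "INSERT INTO model (name, length, breadth, height, lb_ratio) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, length, breadth, height, length / breadth),
    )
    return cur.lastrowid


def add_curve(conn, model_id, curve_name, points):
    """Store one curve as one row per (x, y) point."""
    conn.executemany(
        "INSERT INTO curve_point (model_id, curve_name, x, y) VALUES (?, ?, ?, ?)",
        [(model_id, curve_name, x, y) for x, y in points],
    )


def closest_models(conn, target, n=10):
    """Return (id, name, lb_ratio) for the n models with L/B nearest to target."""
    return conn.execute(
        "SELECT id, name, lb_ratio FROM model ORDER BY ABS(lb_ratio - ?) LIMIT ?",
        (target, n),
    ).fetchall()


def curve(conn, model_id, curve_name):
    """Return the (x, y) points of one curve, sorted by x."""
    return conn.execute(
        "SELECT x, y FROM curve_point "
        "WHERE model_id = ? AND curve_name = ? ORDER BY x",
        (model_id, curve_name),
    ).fetchall()


# Invented sample data: L/B ratios of 2.0, 3.25 and 2.4.
add_model(conn, "M-1001", 10.0, 5.0, 2.0)
add_model(conn, "M-1002", 13.0, 4.0, 2.0)
m3 = add_model(conn, "M-1003", 9.0, 3.75, 1.5)
add_curve(conn, m3, "force_vs_speed", [(1.0, 0.8), (2.0, 3.1), (3.0, 7.2)])

nearest = closest_models(conn, 2.5, n=2)
print([name for _, name, _ in nearest])  # ['M-1003', 'M-1001']
print(curve(conn, nearest[0][0], "force_vs_speed"))
```

The curve points live in their own table so that each curve can have any number of points, and the ids returned by the nearest-ratio query are enough to pull the matching curves.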

gorilla
+2  A: 

Yes, a database is probably the best approach for this.

A relational database (which usually uses SQL for data access) is suitable for data that is more or less structured as tables.

To give you an idea:

You could have a main table model with fields name, width, etc. Then add subtable(s) for any values that can appear more than once, each referring back to model (look up "foreign key").

Then a subtable for your actual curves, again referring back to model.

How to actually model the curves in the DB I don't know, as I don't know how you model them. But if it's lots of numbers, it can go into the DB.
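If each curve is simply a list of x-y points, one possible layout is one row per point, which can be written back out as plain text on demand. The sketch below assumes that layout. The table curve_point, its columns, and the functions store_curve and curve_as_text are hypothetical, and Python's built-in sqlite3 module is used only so the example runs without a database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE curve_point (
        model_name TEXT NOT NULL,     -- in a full schema: a foreign key to the model table
        curve_name TEXT NOT NULL,
        seq        INTEGER NOT NULL,  -- keeps the points in their original order
        x          REAL NOT NULL,
        y          REAL NOT NULL,
        PRIMARY KEY (model_name, curve_name, seq)
    )
""")


def store_curve(conn, model_name, curve_name, points):
    """Store a curve as one row per (x, y) point."""
    conn.executemany(
        "INSERT INTO curve_point VALUES (?, ?, ?, ?, ?)",
        [(model_name, curve_name, i, x, y) for i, (x, y) in enumerate(points)],
    )


def curve_as_text(conn, model_name, curve_name):
    """Return the curve as tab-separated 'x<TAB>y' lines, ready to write to a file."""
    rows = conn.execute(
        "SELECT x, y FROM curve_point "
        "WHERE model_name = ? AND curve_name = ? ORDER BY seq",
        (model_name, curve_name),
    )
    return "".join(f"{x}\t{y}\n" for x, y in rows)


# Invented sample points for illustration only.
store_curve(conn, "M-1234", "force_vs_speed", [(0.5, 1.2), (1.0, 4.9), (1.5, 11.0)])
text = curve_as_text(conn, "M-1234", "force_vs_speed")
print(text, end="")
```

Writing the returned string to a file gives the plain text x-y format the question asks for.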

It seems you know little about relational DBMSs. Consider reading something on Wikipedia, or doing a few simple DBMS tutorials (PostgreSQL has some: http://www.postgresql.org/docs/8.4/interactive/tutorial.html , but there are many others). Then pick a DBMS to try out (PostgreSQL is probably not a bad choice, but again there are many others).

Then try implementing a simple table schema, and get back to us with any detail questions (which you'll probably have).

One more thing: Those questions are probably more appropriate to serverfault.com.

sleske
@sleske - since you seem to be knowledgeable about these things, one more question, if I may. Do you think another type of database might be more appropriate? My data is generally structured so that for every model I have several parameters (all scalars), so one model means one row of parameters. And for every model, I have several tables of data (the curves). Would it maybe be more appropriate to put it in a "traditional" (not sure what the correct name is) database, like dBase (one of the old ones)? They are, if I recall correctly, different from today's SQL kind.
ldigas
@ldigas: You're very welcome to post questions. But please post them as questions; that's what this site is for :-). It's probably best to post on stackoverflow.com, as it's a programming question. And no, I would not generally recommend dBase these days, but opinions might vary.
sleske
Fair enough. Thanks! (Oh, and by the way, yes, I'm very aware of the nature of all three sites. It's just that sometimes it is easier to deal with these details in the comments than to fill up a whole board with detail questions regarding one question.)
ldigas
A: 

This is arguably scientific data, so you might find libraries and formats intended for arbitrary scientific data useful, such as HDF5 (http://www.hdfgroup.org/). Note that I am not an expert.

Norky
Yes, measured and predicted data. But from a field for which there are no file formats, AFAIK.
ldigas
The idea with these tools is that they make it easy to develop your own format. Another one is Silo: http://wci.llnl.gov/codes/silo/
Norky