I'm new to database design and I'm considering implementing something that will be time-consuming, so I wanted to ask here first whether it is the best approach.

Facts:

  • For my purposes here, let's define a plot to be a set of x,y pairs.
  • Plots can have different numbers of x,y pairs.
  • Furthermore, for a particular plot, x,y pairs can be added or removed.
  • I will have thousands upon thousands of plots, each of which will potentially have hundreds of pairs.
  • In the application, I will regularly have to collect thousands of plots, integrate each plot over a known domain, sum the integrals, and present the total to the user.
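To make the last requirement concrete, here is a minimal Python sketch of the integrate-and-sum step. It is only an illustration under stated assumptions: points are joined by straight lines (trapezoidal rule), and any part of the domain outside a plot's x-range contributes nothing. The function names `integrate_plot` and `total_area` are hypothetical.

```python
def integrate_plot(pairs, lo, hi):
    """Trapezoidal integral of one plot over the domain [lo, hi].

    Assumes the (x, y) pairs are joined by straight lines and that the
    plot contributes nothing outside its own x-range.
    """
    pts = sorted(pairs)  # order by x before walking the segments
    total = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        a, b = max(x0, lo), min(x1, hi)  # clip the segment to the domain
        if a >= b:
            continue  # segment is outside the domain or has zero width
        slope = (y1 - y0) / (x1 - x0)
        ya = y0 + slope * (a - x0)
        yb = y0 + slope * (b - x0)
        total += 0.5 * (ya + yb) * (b - a)
    return total


def total_area(plots, lo, hi):
    """Sum the integrals of many plots over the same domain."""
    return sum(integrate_plot(pairs, lo, hi) for pairs in plots)


plots = [[(0, 0), (2, 2)], [(0, 1), (4, 1)]]
print(total_area(plots, 0, 2))
```

Because each plot only needs its pairs in x order, whatever storage is chosen mainly has to return one plot's pairs quickly and sorted.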

I need to design a way to store this in a database so that retrieval, modification, and storage of plots is fast.

Idea 1: Create a Plot table that holds a text list of x vals and a text list of y vals and parse them within the program. (I actually asked about this in an earlier question and this idea was summarily executed by the SO community! I think I understand why now.)

Idea 2: Create a Plot table that holds the metadata of the plot (meaning, units). Then create a Pairs table where each row holds an x val, a y val, and a foreign key that points to the corresponding row in the Plot table. To draw a Plot, I would have to query the DB for all the corresponding Pairs and order them before I could plot them. Is there anything foolish, time-consuming, or memory-consuming here?

Idea 3: None of the above... come up with one that's better than what I've thought of so far.

Note: I'm forced to use Microsoft SQL Server Express, and I'm not sure when I will hit the limits of what SQL Express can do. Also, I kinda spoiled myself by learning Linq-to-SQL right after I started learning the fundamentals of databases... therefore I'm really still in an object-oriented design mentality.

+1  A: 

Idea 2 is the correct way to do this. Regarding performance, make sure you put indexes on the pairs (on the foreign key and the x value) so that ordering and joining are quick. Also keep the datatypes AS SMALL AS POSSIBLE; you would be amazed how many times someone uses a large integer for a value they know will never be greater than 10, for example.
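A minimal sketch of that layout follows, written in Python with the built-in `sqlite3` module purely so it runs anywhere; it is not SQL Server syntax, and the table, column, and index names (`Plot`, `Pair`, `IX_Pair_PlotId_X`) plus the helpers `add_plot` and `load_pairs` are assumptions for illustration. The composite index on (PlotId, X) is what lets the database find one plot's pairs and return them already ordered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Plot (
    PlotId  INTEGER PRIMARY KEY,
    Meaning TEXT NOT NULL,
    Units   TEXT NOT NULL
);
CREATE TABLE Pair (
    PlotId INTEGER NOT NULL REFERENCES Plot(PlotId),
    X      REAL NOT NULL,
    Y      REAL NOT NULL
);
-- Covers both the lookup by plot and the ORDER BY X.
CREATE INDEX IX_Pair_PlotId_X ON Pair (PlotId, X);
""")


def add_plot(conn, meaning, units, pairs):
    """Insert one plot and its (x, y) pairs; return the new plot id."""
    cur = conn.execute(
        "INSERT INTO Plot (Meaning, Units) VALUES (?, ?)", (meaning, units)
    )
    plot_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO Pair (PlotId, X, Y) VALUES (?, ?, ?)",
        [(plot_id, x, y) for x, y in pairs],
    )
    return plot_id


def load_pairs(conn, plot_id):
    """Return one plot's pairs ordered by x."""
    rows = conn.execute(
        "SELECT X, Y FROM Pair WHERE PlotId = ? ORDER BY X", (plot_id,)
    )
    return rows.fetchall()


plot_id = add_plot(conn, "demand", "kW", [(2.0, 5.0), (0.0, 1.0), (1.0, 3.0)])
print(load_pairs(conn, plot_id))
```

On SQL Server the same idea would use its own column types; for instance, a 4-byte `real` instead of an 8-byte `float` is one way to keep the pair rows small if the reduced precision is acceptable.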

Simon Lee
+1  A: 

Idea 2 is fine, provided that all plots have single Y values -- one curve only. If some plots may have several curves, then it may be beneficial to borrow the concept of series from spreadsheet charting.

  • A series is a column (or row) of numbers -- vector.
  • A plot can have many series.
  • One series can be used by many plots, for example if X axis represents time intervals and many plots share the same axis.
  • Orientation in the PlotSeries table places a series on X or Y, so there can be many plots that use same data presenting series against each other.
  • OrderNo in the PlotSeries table defines which series is plotted first and can be used to assign a default color to each curve.
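One possible reading of the series model is sketched below in Python with the built-in `sqlite3` module (again a stand-in for SQL Server). Only `PlotSeries`, `Orientation`, and `OrderNo` come from the description above; every other table, column, and function name here (`Series`, `SeriesValue`, `PointNo`, `series_for_plot`) is a guess for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Plot   (PlotId   INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE Series (SeriesId INTEGER PRIMARY KEY, Name  TEXT NOT NULL);
CREATE TABLE SeriesValue (
    SeriesId INTEGER NOT NULL REFERENCES Series(SeriesId),
    PointNo  INTEGER NOT NULL,          -- position within the vector
    Value    REAL    NOT NULL,
    PRIMARY KEY (SeriesId, PointNo)
);
CREATE TABLE PlotSeries (
    PlotId      INTEGER NOT NULL REFERENCES Plot(PlotId),
    SeriesId    INTEGER NOT NULL REFERENCES Series(SeriesId),
    Orientation TEXT    NOT NULL CHECK (Orientation IN ('X', 'Y')),
    OrderNo     INTEGER NOT NULL,       -- drawing order of the series
    PRIMARY KEY (PlotId, SeriesId, Orientation)
);
"""


def series_for_plot(conn, plot_id):
    """Return (orientation, order_no, name, values) for each series of a plot."""
    links = conn.execute(
        "SELECT ps.Orientation, ps.OrderNo, s.SeriesId, s.Name "
        "FROM PlotSeries ps JOIN Series s ON s.SeriesId = ps.SeriesId "
        "WHERE ps.PlotId = ? ORDER BY ps.Orientation, ps.OrderNo",
        (plot_id,),
    ).fetchall()
    result = []
    for orientation, order_no, series_id, name in links:
        values = [v for (v,) in conn.execute(
            "SELECT Value FROM SeriesValue WHERE SeriesId = ? ORDER BY PointNo",
            (series_id,),
        )]
        result.append((orientation, order_no, name, values))
    return result


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO Plot VALUES (?, ?)",
                 [(1, "Temperature"), (2, "Pressure")])
conn.executemany("INSERT INTO Series VALUES (?, ?)",
                 [(1, "time"), (2, "temp"), (3, "pressure")])
conn.executemany("INSERT INTO SeriesValue VALUES (?, ?, ?)", [
    (1, 0, 0.0), (1, 1, 1.0), (1, 2, 2.0),
    (2, 0, 20.0), (2, 1, 21.0), (2, 2, 23.0),
    (3, 0, 1.0), (3, 1, 1.1), (3, 2, 1.2),
])
# Both plots reuse the same "time" series on the X axis.
conn.executemany("INSERT INTO PlotSeries VALUES (?, ?, ?, ?)", [
    (1, 1, "X", 1), (1, 2, "Y", 1),
    (2, 1, "X", 1), (2, 3, "Y", 1),
])
print(series_for_plot(conn, 1))
```

Here the shared time axis is stored once and linked to both plots, which is the main saving over storing full x,y pairs per plot.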

Damir Sudarevic