Hi guys,

I'm designing a database using SQL Server 2005.

The main concept of our site is to import XML feeds from suppliers, and different suppliers can represent their data in different ways.

The problem is that I need to design tables to store the imported information.

Some of the columns are fixed, meaning all supplier products must provide the same data in the feed: name, code, price, status, etc.

But some products have optional details: one product might have a color property while another might not.

What is the best way to store this kind of scenario in the database?

Should I create one table for the mandatory columns and other tables to hold the optional columns?

Or should I first list all the possible columns and put them into a single table? (There might be a lot of NULL values.)
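To make the two options concrete, here is a rough sketch of each layout. All table and column names here are just examples, not a recommendation:

```sql
-- Option 1: one wide table; optional attributes are nullable columns
CREATE TABLE Product (
    ProductID   INT IDENTITY PRIMARY KEY,
    SupplierID  INT           NOT NULL,
    Name        NVARCHAR(200) NOT NULL,
    Code        NVARCHAR(50)  NOT NULL,
    Price       DECIMAL(10,2) NOT NULL,
    Status      NVARCHAR(20)  NOT NULL,
    Color       NVARCHAR(50)  NULL,  -- optional: many rows will be NULL
    Size        NVARCHAR(50)  NULL   -- optional
);

-- Option 2: mandatory columns in one table, optional details in another
CREATE TABLE ProductCore (
    ProductID   INT IDENTITY PRIMARY KEY,
    SupplierID  INT           NOT NULL,
    Name        NVARCHAR(200) NOT NULL,
    Code        NVARCHAR(50)  NOT NULL,
    Price       DECIMAL(10,2) NOT NULL,
    Status      NVARCHAR(20)  NOT NULL
);

CREATE TABLE ProductDetail (
    ProductID      INT           NOT NULL REFERENCES ProductCore (ProductID),
    AttributeName  NVARCHAR(50)  NOT NULL,  -- e.g. 'Color'
    AttributeValue NVARCHAR(200) NOT NULL,
    PRIMARY KEY (ProductID, AttributeName)
);
```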

There will be thousands of products, and database speed is essential.

We will be doing a lot of product comparisons between different suppliers.

Our database will be something like www.pricerunner.co.uk.

I hope I've explained the concept well.

+1  A: 

Depends on how you want to access it.

As you say, speed is important - but what are you going to do with those extra, optional bits of information? Do you need to store them at all? Assuming you do, how often do you need to access them?

Essentially, if you will always need to at least check whether they're there, it's probably better to put them into one table. If you need to check anyway, you might as well get it over with as part of the initial query.

If, on the other hand, you can usually run without bothering to check for these extra pieces, and only need to bother when they're specially requested, then it might be better to put them into a different table. The join (or subsequent lookup) will be expensive - much more expensive than pulling NULLs for empty columns - but if it's very infrequent, it would probably cost less in runtime execution in the long run.

Also bear in mind the tradeoff in storage and transport terms - storing lots of empty fields does take some space, and sending back lots of empty fields takes network bandwidth.

If disk space is not a concern but bandwidth is, make sure the application is carefully designed to minimize unnecessary lookups; then, with tight queries, you can store the extra (optional) data but not pass it back unless it's requested.
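As a sketch of that pattern, assuming a mandatory/optional two-table split along the lines the question describes (table names here are made up):

```sql
-- Cheap, common case: mandatory columns only, no join, nothing extra sent back
SELECT ProductID, Name, Code, Price, Status
FROM   ProductCore
WHERE  SupplierID = @SupplierID;

-- More expensive, on-request case: join in the optional attributes
SELECT c.ProductID, c.Name, d.AttributeName, d.AttributeValue
FROM   ProductCore        AS c
LEFT JOIN ProductDetail   AS d ON d.ProductID = c.ProductID
WHERE  c.ProductID = @ProductID;
```

The application decides which query to run, so the join cost is only paid when a caller actually asks for the optional details.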

So, it really all depends on what's important to you. Once you know what your overriding design concerns are, you will know which compromises to make to address those concerns at the expense of others. A balancing act.

Ragster
+1  A: 

Thousands of products (so thousands of rows) - that's really not many at all, so you could normalize the optional data into a few separate tables without it having a dramatic effect on query time.

I would say put your indexes in the right places, optimize your queries, make sure you have your filegroups split up nicely, etc. (just the usual regular old database stuff) and you should be good.
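For example, you would index the columns the product comparisons filter and join on; the table and column names below are illustrative only:

```sql
-- Support comparing the same product code across suppliers,
-- covering the columns the comparison reads (SQL Server 2005 INCLUDE)
CREATE NONCLUSTERED INDEX IX_Product_Code
    ON Product (Code)
    INCLUDE (Price, Status);

-- Support listing all products from one supplier
CREATE NONCLUSTERED INDEX IX_Product_Supplier
    ON Product (SupplierID);
```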

Meiscooldude
This is true - with a few thousand records, the time difference would be hard to measure. Scale it up to a few million and it would make a huge difference. Taking scale into account is important too.
Ragster