ansaurus

Question

Answer 1

+3 A:

The second option is still better (even with the third column, which is no problem), because it allows you to easily add new product types without modifying the tables. (The technical term for this is that the schema is better"normalized"). This is much more maintainable. In addition, it means you can join across this table much more easily, or do queries to answer "which is the lowest-cost product which product 1 is compatible with" which would be very hard to do with the first table.

njk 2010-07-23 19:42:41

I hope you don't mind another question. I already have the second table laid out but not filled yet. What I mean by this is: every product is compatible with itself, so I have it filled out for every product which is compatible with itself. I do not have any compatibilities while mixing products yet.Is there any way to quickly fill out every product in SECOND for every non-unique FIRST?I hope you understand my poor wording

thomas 2010-07-23 20:17:25

@thomas: that would probably be best done by writing code that does inserts while reading the list of existing products from... something?

FrustratedWithFormsDesigner 2010-07-23 20:34:37

@thomas, glad I could help... Please ask your second question as another question on StackOverflow, rather than in the comments :) You can post the link here once you have!

njk 2010-07-23 20:42:22

You might consider representing Compatible with a floating point value in the range 1.0 to 0.0. Then you'd be able to handle more gradations of "maybe" if you needed to in a nice 1990s fuzzy-logic kind of way. Other than that, this is the way to go. We do a lot of this kind of thing in my shop, and it performs well and isn't that hard to maintain.

Ollie Jones 2010-07-24 00:33:25

Answer 2

+2 A:

Stick with the second option. If you ever add or remove products you only affect some rows. If you use the first option and add/remove products, you are changing the structure of the table.

FrustratedWithFormsDesigner 2010-07-23 19:42:52

Answer 3

A:

There is a third option - a "semi-normalized" form. In this case, you'd have a structure something like this:

Product  DefinitelyCompatible  MaybeCompatible
----------------------------------------------
 1        '2, 3, 4'             '5'
 2        '1, 4'                '3'
 3        '1'                   '3, 4'
 4        '1, 2'                '3'
 5        ''                    '1'

Where the first field is the product ID, and the second and third fields are strings that contain lists of IDs. Using this structure, you can use the FIND_IN_SET() MySQL function to search through the lists, or if you want to look up a given product's compatibilities, you only need fetch one row and then split the strings yourself.

This allows you to have only as many rows as you have products (in a Nx3 table), while also minimizing the number of rows you need to alter to add a new product (or update an existing one).

A further note on performance - assuming that the relationships are symmetric (if a -compat-> b, then b -compat-> a), you don't even need to use FIND_IN_SET() - you can just fetch whichever object (a or b) you need to get the lists for (if you need everything that's compatible with 23, then you fetch 23's row; if you want to see if 4 is compatible with 5, you can fetch either one's row and see if the other appears in it). FIND_IN_SET() would then only be necessary if the check is a portion of a subquery, rather than in your code.

Amber 2010-07-23 19:49:28

This is most definitely not the best way to achieve this association, especially in a product database like this one. Perhaps in data warehouse scenarios this solution would be work, but njk below points out but one of many problems with this design.

WCWedin 2010-07-23 20:01:38

This might work OK, but I'd be concerned that the `DefinitelyCompatible` and `MaybeCompatible` might not index well. Whether or not that matters depends on the queries. Also, I'm not sure how portable this is to other platforms that don't have a `FIND_IN_SET()` function.

FrustratedWithFormsDesigner 2010-07-23 20:06:00

Actually, now that I think about it, it might never be necessary to index `DefinitelyCompatible` and `MaybeCompatible`. Join queries might be tricky though.

FrustratedWithFormsDesigner 2010-07-23 20:08:03

that's true, I wasn't considering queries

thomas 2010-07-23 20:10:31

Yes, a fully normalized form is almost always the best for joins. Which form is best for a given application depends on the context; some applications are far more likely to require joins than others. Not all logic is best done in the DB.

Amber 2010-07-23 20:15:36

Storing delimited data in a database is almost always a bad idea, especially for something as simple as this. What's the advantage? Fewer rows?

njk 2010-07-23 20:41:09

Sometimes, it's more efficient to just grab all of the items than it is to ask the DB engine to do a join every time. If the data model you're working with is one where you're almost always going to be asking only "what things are associated with item X", and not doing the fancier queries that normalization assists with, delimited data can be far more efficient. Sure, when you have a database engine hammer, everything looks like a join nail, but that's not always true.

Amber 2010-07-26 01:42:49

ansaurus

tags:

views:

answers:

nn table vs (n^2)3 table in mysql

related questions

ansaurus

tags:

views:

answers:

n*n table vs (n^2)*3 table in mysql

related questions

nn table vs (n^2)3 table in mysql