I have a MySQL table that represents a company's products. It records whether or not two products are compatible with each other.

The table looks something like this:

Product1    Product2    Compatible?
A           A           Yes
A           B           ?
A           C           No
A           D           ?
B           A           ?
B           B           Yes
B           C           ?
B           D           Maybe
C           A           No
C           B           ?
C           C           Yes
C           D           ?
D           A           ?
D           B           Maybe
D           C           ?
D           D           Yes

Note that while no two rows are exact duplicates, some data is redundant. If A is not compatible with C, then C is obviously not compatible with A, which makes one of the two rows redundant. I have these rows in the first place because I built the table using a nested for loop. Would you recommend deleting the rows with duplicate meaning for the sake of saving space? Or leaving them there for (possibly?) easier maintenance?
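
To make the redundancy concrete, here is a minimal sketch of the difference between what a nested loop generates and the set of unordered pairs. The product list is just the four names from the example table, and the variable names are illustrative assumptions, not anything from the real application.

```python
from itertools import combinations_with_replacement, product

# Hypothetical product list, taken from the example table above.
products = ["A", "B", "C", "D"]

# Equivalent to a nested for loop: every ordered pair,
# so both (A, C) and (C, A) are generated.
ordered_pairs = list(product(products, repeat=2))

# Each unordered pair exactly once, with the earlier product first.
# Self-pairs such as (A, A) are kept, as in the example table.
unordered_pairs = list(combinations_with_replacement(products, 2))

print(len(ordered_pairs))    # 16
print(len(unordered_pairs))  # 10
```

With four products the table shrinks from 16 rows to 10; the saving approaches half as the number of products grows.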

+1  A: 

Space is cheap ... there is no need to delete data in today's world. However, that doesn't mean we can't be efficient. If I were approaching this as a database design problem, I would create two tables: one for products, and one for compatibility, which you have already done.

But in the example above you do not give a reason why you are tracking non-compatibility. If the products are in the compatibility table, that means they are compatible ... if they are not in it, well, that means they are not compatible.

How are you populating these rows? You never give a reason why you add a row for A to C and then also add a row for C to A. Why add the second row at all?

In your table, what exactly is stored in the columns for products A and B? A product ID? A product name?

Ryan
When creating a compatibility table you would still get doubles; even with a unique index on the primary key it would still be possible to create both the (A,C) and (C,A) pairs. In that case the solution would be to create a BEFORE INSERT trigger that validates that the pair is also unique when tried the other way around.
Mark
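
A rough sketch of the trigger idea from the comment above. To keep it runnable without a server, it uses Python's built-in `sqlite3` module as a stand-in for MySQL, so the trigger syntax is SQLite's (in MySQL the trigger body would raise the error with `SIGNAL` instead of `RAISE`). The table name, column names and trigger name are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE compatibility (
    product1 TEXT NOT NULL,
    product2 TEXT NOT NULL,
    status   TEXT NOT NULL DEFAULT '?',
    PRIMARY KEY (product1, product2)
);

-- Abort the insert if the mirrored pair is already stored.
CREATE TRIGGER compatibility_before_insert
BEFORE INSERT ON compatibility
BEGIN
    SELECT RAISE(ABORT, 'mirrored pair already exists')
    WHERE EXISTS (
        SELECT 1 FROM compatibility
        WHERE product1 = NEW.product2 AND product2 = NEW.product1
    );
END;
""")

conn.execute("INSERT INTO compatibility VALUES ('A', 'C', 'No')")

try:
    # Same pair the other way around: the trigger should reject it.
    conn.execute("INSERT INTO compatibility VALUES ('C', 'A', 'No')")
    mirrored_rejected = False
except sqlite3.DatabaseError:
    mirrored_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM compatibility").fetchone()[0]
```

The primary key alone only blocks a second (A,C); the trigger is what blocks (C,A).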
I have only populated the first two columns, using a for loop; I have not populated the third column at all yet. There are a lot more products than in the example I made, which I simplified a bit.
thomas
The products still go in the compatibility table when they aren't compatible because it is not simply a binary yes/no. There are some that are "still testing".
thomas
That makes more sense then. Like Mark said, you will have to add some sort of validation before you save to the database to determine whether there is redundant data. What kind of application is this? Is it web-based or a desktop application? Following on from that, what language are you using to populate the database or write your application?
Ryan
A: 

It is almost always worthwhile to properly normalize your data. Lower volume of data and no chance of inconsistency are just the two most obvious reasons why. So if your compatibility is in fact guaranteed to be symmetric, and remain symmetric forever (and not some kind of upward- vs. backward-compatibility...), then yes, you should delete the redundant rows.

The only caveat is that in the future you must either query the compatibility in the canonical order (with the lower product, however you define that, in the first slot of your query), or use a disjunctive query, otherwise you might miss a legitimate combination. (The first of those options is obviously the better solution, since the second reintroduces unnecessary processing effort.)
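
The two query styles can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in `sqlite3` instead of MySQL so that it runs standalone, it defines the "lower" product by plain string comparison, and the table, column and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE compatibility (
        product1 TEXT NOT NULL,
        product2 TEXT NOT NULL,
        status   TEXT NOT NULL,
        PRIMARY KEY (product1, product2),
        CHECK (product1 <= product2)  -- rows are stored in canonical order only
    )
""")
conn.executemany(
    "INSERT INTO compatibility VALUES (?, ?, ?)",
    [("A", "C", "No"), ("B", "D", "Maybe")],
)

def lookup_canonical(a, b):
    """Sort the pair first, then query a single (product1, product2) key."""
    first, second = sorted((a, b))
    row = conn.execute(
        "SELECT status FROM compatibility WHERE product1 = ? AND product2 = ?",
        (first, second),
    ).fetchone()
    return row[0] if row else None

def lookup_disjunctive(a, b):
    """Query both orientations with OR; no sorting needed, but more work per query."""
    row = conn.execute(
        "SELECT status FROM compatibility "
        "WHERE (product1 = ? AND product2 = ?) OR (product1 = ? AND product2 = ?)",
        (a, b, b, a),
    ).fetchone()
    return row[0] if row else None

print(lookup_canonical("C", "A"))    # No
print(lookup_disjunctive("D", "B"))  # Maybe
```

Either function accepts the products in any order; the canonical version just moves the ordering decision into the calling code.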

Kilian Foth
What if the user inputs which two products they want to check? I have no control over whether they choose "A" and "B" rather than "B" and "A".
thomas
Then your business code needs to sort the two products before firing off the query. I mean, it's not like you're generating database queries directly from your TextInput widgets, right? <sound of chirping crickets> Er... right?
Kilian Foth
I haven't made it yet (so no, it is not generating straight from the text input), but that's a good point. Thanks, I'm fairly new to MySQL and just trying to make sure everything I do is kosher.
thomas
Is there any way to automate this process?
thomas