views:

151

answers:

1

I have the following problem to solve:

Let's say there is a table A that contains an number of elements, this table is copied to become table B. In table B some of the original elements are lost and some are added. A reference table AB keeps track of these changes. Then table B is copied to be table C and again some of the existing elements get lost and some are added. Another reference Table BC keeps track of these relations. ... etc.

There is an n number of such tables with an n-1 number of reference tables.

If I want to know which of the elements of my choice in table C where already present in A, I can do that by doing something like:

SELECT AB.oldID
FROM AB
JOIN BC
WHERE BC.newID IN (x, y, z)

Now since the number of reference tables can wary, the number of JOIN lines can wary.

Should I concatenate the query by looping over the steps and adding JOIN lines or shoudl I rather write a recursive function that selects only the members of the next step and then let the function call itself until I have the end result?

Or is there an other even better way to do something like that?

+1  A: 

Since your table names vary, you'll need to build some kind of a dynamical query.

If you do the recursive function approach, you'll need to pass the resultsets between the function calls somehow.

MySQL has no array datatype, and storing the results in a temp table is way too long.

Conclusion: use joins.

Update:

Here's a sample query which returns the entries that persisted through revision A to revision M (with one table design):

SELECT  *
FROM    entries e
WHERE   NOT EXISTS
        (
        SELECT  *
        FROM    revisions r
        JOIN    revision_changes rc
        ON      rc.revision_id = r.id
        WHERE   rc.entry_id = e.id
                AND rc.deleted
                AND r.revision_id BETWEEN 'A' AND 'M'
        )

This way, you just fill the added and deleted fields of revision_changes for the revisions where the entry was added or deleted.

Quassnoi
Thanks. Do you mean NO array datatype? You're the DB expert, is it often done, such concatenation of queries?
tharkun
No, usually one table is used with an additional column to store the version identifiers.
Quassnoi
Ok, I think I have to rethink my whole layout! Thanks for the brain food.
tharkun
[ *Looks at the mirror* ] What I'm turning into? I'm already half human, half database expert!
Quassnoi
Ever since I've started using SO you've been around answering complex DB questions with ease and faster than Jon Skeet allows. Expertise is needed to do that... as your mirror tells you.
tharkun
Faster than `Jon Skeet`? Really? Wow!
Quassnoi
Yep. * (* required - at least 15 characters)
tharkun
I've seen Jon Skeet comment that he's not a database expert. No problem, everyone has their strengths. StackOverflow experts can -- as a group -- answer anything! :-)
Bill Karwin
:) SO... still a marvel after all these months ;)
tharkun
@Quassnoi: That's what I meant... amazing (the query)! You may modestly state that I'm easy to amaze, which is probably true so I'm not entitled to estimate the validity of my own amazement. But then again that must be true for all technical amazement.
tharkun
The task before me is writing a migration script for all the tables that exist alreay in production using the multi-table system.
tharkun
`@tharkun`: Then just build the joins dynamically: `AB JOIN BC JOIN CD` etc. Will be much more efficient that returning the resultsets into `PHP` and filtering.
Quassnoi
@Quassnoi: Thanks! I will do that first for the minor release and then refactor the whole thing and have a proper revision approach in one table as you suggest for the next major release. It will be cool to learn that approach.
tharkun