views:

96

answers:

3

If I have a table structure like this:

Transaction [TransID, ...]

Document [DocID, TransID, ...]

Signer [SignerID, ...]

Signature [SigID, DocID, SignerID, ...]

And the business logic is like this:

  • Transactions can have multiple documents
  • Documents can have multiple signatures
  • And the same signer can have multiple signatures in multiple documents within the same transaction

So, now to my actual question:

If I wanted to find all the documents in a particular transaction, would it be better, performance-wise, if I also stored the TransID and the DocID in the Signer table as well so I have smaller joins. Otherwise, I'd have to join through the Signature > Document > Transaction > Documents to get all the documents in the transaction for that signer.

I think it's really messy to have that many relationships in the Signer table though and it doesn't seem "correct" to do it that way (also seems like an update nightmare) but I can see that it might be better performance for direct joins. Thoughts?

TIA!

+10  A: 

Go with the normalized version. Only reconsider if performance becomes an issue. The other option is a maintenance hazard.

spender
+1: "also seems like an update nightmare". Close. It is an update nightmare. Indeed, we prevent the update nightmare by normalizing
S.Lott
+1 - Normalization is the key. Also, extra credit for advising against premature optimization!
Mark Brittingham
excellent - thanks everyone!
EdenMachine
+4  A: 

Store them as spender suggested in a normalized table. Add indexes to retrieve your data and see what your performance is like. Although you have multiple-to-multiple relationships, look at them as a percentage of the total.

Josef Richberg
+1 for mention indices - that is the correct way to first think about speeding up queries - not using a denormalized structure. Plus, I'm happy to give you your first upvote!
Mark Brittingham
+1  A: 

I'll also point out that storing the transactionid in the signer table won't work as signers are likely to be involved in multiple transactions.

This is such a little, tiny, uncomplicated query that it is unlikely the joins will be problem as long as they are properly indexed.

HLGEM
In it's current form, signers will "expire" after the transaction is completed or cancelled so they won't be involved in multiple transactions. However, that is the plan for the future so one day it would be relevant so +1 for that anyway.
EdenMachine