views:

43

answers:

4

Assume A and B are conceptually different entities that have a few similar properties and of course their own specific properties. In database design, should I put those two entities in one big aggregated table or two respectively designed tables.

For instance, I have two types of payment; Online-payment and Manual-payment with following definition,

TABLE [OnlinePayments]
( 
    [ID] [uniqueidentifier], 
    [UserID] [uniqueidentifier], 
    [TrackingCode] [nvarchar](32), 
    [ReferingCode] [nvarchar](32),
    [BankingAccID] [uniqueidentifier],
    [Status] [int],
    [Amount] [money],
    [Comments] [nvarchar](768),
    [CreatedAt] [datetime],
    [ShopingCartID] [uniqueidentifier],
)

And

TABLE [ManualPayments]
(
 [ID] [uniqueidentifier],
 [UserID] [uniqueidentifier],
 [BankingAccID] [uniqueidentifier],
 [BankingOrgID] [uniqueidentifier],
 [BranchName] [nvarchar](64),
 [BranchCode] [nvarchar](16),
 [Amount] [money],
 [SlipNumber] [nvarchar](64),
 [SlipImage] [image],
 [PaidAt] [datetime],
 [Comments] [nvarchar](768),
 [CreatedAt] [datetime],
 [IsApproved] [bit],
 [ApprovedByID] [uniqueidentifier],
)

One of my friends told me that creating two distinct tables for such similar entities is not a well design method and they should be put in one single table for the sake of performance and ease of data manipulations.

I’m now wondering what to do? What is the best practice in such a case?

A: 

I might actually make three tables -- a Payments one with the common fields and nullable foreign key referencing the other two, and two specific ones with foreign keys referencing back to Payments. With the proper indices and a good relational engine (e.g., PostgreSQL), I've always found this "inheritance-like (with backpointers)" hybrid approach quite serviceable. If you're stuck with an inferior engine, e.g. one with an optimizer unable to easily meld multiple indices in one search, then I guess that the single-table approach (holding your nose when you're forced to use it;-) might be usable. I don't see what the two-tables compromise is buying you over either of these solutions -- the "pure" three-tables one, or the "practicality beats purity" single-table one; care to expand on this point?

Alex Martelli
Using SQL Server and Linq-to-Sql, I found that inheritance implementation is not easy with 3 tables as you mentioned. Is it really bad idea having two tables in such a case?
Haghpanah
I don't know nor use Linq -- and if it forces you into inferior Sql, I'm not tempted to;-). Is Linq the only answer to my question on what the two-table approach is buying you?
Alex Martelli
A: 

split..split..split..split...


Since you do have ID as a common tag (and ID is unique for each transaction) :

Have TABLE1 ( ID, USERID, AMOUNT, Comments, CreatedAt, *TYPE* )

i.e. the common fields Use the TYPE field to store if it is OnlinePayment or ManualPayment

Store TABLE2 as follows:

TABLE [OnlinePayments] ( 

[ID] [uniqueidentifier],

[UserID] [uniqueidentifier],

[TrackingCode] [nvarchar](32),

[ReferingCode] [nvarchar](32),

[BankingAccID] [uniqueidentifier],

[Status] [int],

[ShopingCartID] [uniqueidentifier], )

Store TABLE3 as follows:

TABLE [ManualPayments] (

[ID][uniqueidentifier],

[UserID] [uniqueidentifier],

[BankingAccID] [uniqueidentifier],

[BankingOrgID] [uniqueidentifier],

[BranchName] [nvarchar](64),

[BranchCode] [nvarchar](16),

[SlipNumber] [nvarchar](64),

[SlipImage] [image],

[PaidAt] [datetime],

[IsApproved] [bit],

[ApprovedByID] [uniqueidentifier], )

Use the unique - ID to map the rows between table1-table2 & table1-table3

This way the memory consumption ought to be reduced to the bare minimum. This will in-turn speed-up your queries. Also performance ought NOT to degrade too much as in other cases.

Also, Plz go through this:

GoodLUCK!!

CVS-2600Hertz
+3  A: 

Your question begins with an interesting word: "conceptually". Conceptually - logically - all payments are basically the same. Most of the time your application is probably just interested in payments - has the customer paid? has the payment been processed? do we need to refund a payment?

So, conceptually, your application probably needs a PAYMENTS table. But, you have two different types of payment, and you might have additional types in the future (direct debit, standing orders). You can represent these sub-types in three ways:

  1. add a PAYMENT_TYPE column and lots of optional columns to the PAYMENTS table.
  2. implement separate child tables for the different payment types, holding just the sub-type columns.
  3. what you currently have - do away with the PAYMENTS table and duplicate the common columns in each sub-type tables.

From experience I have found the single table to be an absolute nightmare to code against. We might get away with it when we only have two sub-types, but the logic gets gnarlier the more sub-types we add. It also means we cannot reference the sub-types in foreign keys.

The "no common table" solution is - to my mind - preferable to a single table. Its major drawback is that it makes it harder to work with the generic entity. We cannot enforce a foreign key against PAYMENTS. It also helps to have a database which supports sharing a UID across tables (such as Oracle's sequence), and which supports the UNION ALL operator.

The super-type plus sub-types solution is definitely the best solution. It allows us to deal easier with generic PAYMENTS and with specific kinds of payment. We get an intuitive data model, we get the ability to enforce foreign keys at the appropriate granularity, we get something that fits the way our application works.

However, you are using LINQ-to-SQL which currently does not support the super-type/sub-type implementation. It annoys me immensely to have a data model driven by the limitations of a ORM tool. Other applications which share the database have to pay the cost of a sub-standard model even though they don't use the ORM tool in question (and it is pretty common for several clients to access one database).

Anyway ....

Keep with your current design. It is simpler and cleaner. As for performance, while there may be scenarios where a single table is faster there will be other scenarios where the separate tables will perform better. Besides, for those times when you do need to handle generic payments, SQL Server supports UNION ALL.

APC
A: 

I agree with the above - 3 tables is the way to go. Performance? It will actually be better!

3 tables will give you an opportunity to fine tune your queries and indexes. It's better to have 1 index for each table than 3 indexes for one table - in case of OLTP.

Perceived "join" price, is only applicable to inferior database optimizers and are not applicable to MSSQL and Oracle, unless you go over board.

IMHO