views:

173

answers:

3

Based on the information I have provided below, can you give me your opinion on whether its a good idea to denormalize separate tables into one table which holds different types of transactions?.. What are the pro's/con's?.. Has anyone attempted this before?.. Banking systems use a CIF (Customer Information File) [master] where customers may have different types of accounts, CD's, mortgages, etc. and use transaction codes[types] but do they store them in one table?.. The app is single-user with max of 10TPHour. There's no high TPM's in this app, it's not a bank, it's a...

Pawnshop Application:

I have separate tables for Loan, Purchase, Inventory & Sales transactions. Rows from each of these tables are joined to their corresponding customer in the following manner:

customer.pk_id SERIAL = loan.fk_id      INTEGER; 
                      = purchase.fk_id  INTEGER; 
                      = inventory.fk_id INTEGER; 
                      = sale.fk_id      INTEGER;  

Since there are so many common properties within the four tables, which revolves around the same merchandise being: pawned, bought, transferred to inventory and sold, I experimented by consolidating the four tables into one table called "transaction" and added the following column:

transaction.trx_type char(1) {L=Loan, P=Purchase, I=Inventory, S=Sale}

Scenario:

A customer initially pawns merchandise, makes a couple of interest payments, then decides he wants to sell the merchandise to the pawnshop, who then places merchandise in Inventory and eventually sells it to another customer.

I designed a generic transaction table where for example:

transaction.main_amount DECIMAL(7,2) 

in a loan transaction holds the pawn amount, in a purchase holds the purchase price, in inventory and sale holds sale price.

This is clearly a denormalized design, but has made programming a lot easier and improved performance. Any type of transaction can now be performed from within one quantum-style screen, without the need to change to different tables. The key is that the same merchandise navigates through different types of transactions over time.

As the merchandise morphs into different transaction types, I'm storing its history of all previous transactions performed with it, from the time it first made its appearance at the pawnshop through the sale of it to another customer, which could again make its way back into the pawnshop. When merchandise is pawned, it is micro-tagged with an invisible id number (the original ticket number). Example of multiple transactions: Customer A pawns it, makes couple of interest payments, decides he wants to sell it to pawnshop, pawnshop buys it, puts it in inventory, sells it to Customer B, who after a couple of months needs money and pawns it at same pawnshop and ... each transaction gets recorded with a unique serial transaction number and several fields are re-utilized depending on what type of transaction is taking place.

So, several columns are used for multiple purposes, depending on transaction type. Money columns could be credit, debit or suspense depending on tran type.

If anyone has developed a similar type app, I would be interested in hearing from you. Perhaps I could obtain some valuable input into doing this the best way possible, Thank You!

I WOULD LIKE TO HEAR FROM ANYONE WHO HAS ENCOUNTERED A SIMILAR SITUATION WHERE THEY HAVE CONSOLIDATED SEPARATE TABLES INTO ONE TABLE.

+1  A: 

I would argue that it's not denormalized. I see no repeating groups; all attributes depending on a unique primary key. Sounds like a good design to me.

Is there a question here? Are you just looking for confirmation that it's acceptable?

Hypothetically, what would you do if the overwhelming consensus was that you should revert back to separate tables for each type? Would you ignore the evidence from your own experience (better performance, simpler programming) and follow the advice of Stackoverflow.com?

duffymo
@duffymo- 1. Converting four tables into one is not denormalization? 2. Looking for confirmation. 3.Ignore the evidence because my design relates more to the realworld scenario I described in "Scenario", improves performance and simplifies program logic.
Frank Computer
@Frank - I'm not sure. If I had home address, mailing address, shipping address, and billing address in four separate tables, and I decided to have a single address table with an extra column to say how it was used, would that be considered denormalization? I'll leave that to someone more expert than me.
duffymo
A: 

As far as I know, this is not normalized if that is the actual question (it isn't clear). Specifically, you are violating first normal form because it is not the case that your table represents a single "relation" or entity and that there exists some attributes not applicable to some tuples. I.e., I highly doubt that loans, purchases, inventory and sales all have exactly the same attributes. I would bet that you have columns in your table that only apply to loans or only apply to sales and so on.

So, given that you have accepted that it is denormalized, what could go wrong?

  • It will be more difficult to create dependencies. For example, if it is the case that say a purchase is associated with an inventory item or a loan there is no easy means to do that. You would need to add another column that only relates to some types of data.

  • It will be harder to enforce data integrity. For example, if loans cannot have negative amounts but purchases can because of discounts or refunds, it will be cumbersome to enforce that.

  • You will create a hot spot for data. For example, if there are lots of loan transactions but only a few inventory transactions, queries on inventory transactions will be needlessly slow because of the traffic from the other types of transactions. In addition, if you want some loan, purchase, inventory and sale data, you'll have to query the table multiple times.

Generally, problematic schema designs rear themselves during reporting/data analysis/querying. The more cumbersome the queries to get your information, the obvious it is that the schema does not match the business need. If you have analyzed the business needs and are taking a calculated risk based on that analysis, then my only suggestion would be to document your design and design rationale for the next developer.

Thomas
@Thomas- Depending on the transaction type, money amounts are grouped by transaction type and posted as credits or debits to a general ledger. I wish you could view a video-demo of this dynamic, one-page perform screen "does-it-all" and how it properly sums up by tran type. Queries and updates have become so simplified vs. multiple joins and one transaction number column increments irregardless of tran type, like a cash register!
Frank Computer
@Frank Computer - If the entity is merchandise then why did you name it "Transaction"? Simple question: Does every transaction type have *exactly* the same attributes and do all attributes mean the same thing regardless of transaction type? If there is even one exception, storing them in a single table is not first normal form. However, that may not matter. You'll know when you get to reporting.
Thomas
@Frank Computer - Let me reiterate something I said in my post. If you are building whatever reports or analysis required and it is *easier* than if it were 1NF, then your design matches the business requirements and 1NF be damned. There are (rare) times when one needs to break normalization in order to provide the best solution.
Thomas
@Thomas- semantics!.. pawnshops have customers who perform pawn, purchase, sale transactions of merchandise. The goal in this app is to trace merchandise from the moment it first arrives at pawnshop, sold or until melted into final oblivion. Example: according to the historic DSS table, 84% of customers pawn, pay avg. of 2.3 months interest, redeem pawn and re-pawn same merchandise 13.6 days avg. later, whereas new row was created for same customer with same merchandise, now initial pawn/redeem row is updated when re-pawned. As same merchandise morphs through tran types, detail info is stored
Frank Computer
@Thomas- Reports became easier, example: by law, pawns have to be retained 30 days after customer forfeits for lack of payment. Purchases retained for 20 days, incase merchandise is stolen. So, a daily report prints how long each forfeited pawns and each purchase has been sitting in the same holding tank and since were using serialized transaction numbers, its easier to pullout the merchandise with expired retention periods from the one box in the vault holds it all. Oh and I also have a "repair" tranny type for customers who leave jewelry to be repaired.. it all fits in, it all works great!
Frank Computer
@Thomas- If you visit: www.frankcomputer.com you can view a video-demo of the pre-consolidated pawn transaction screen and the separate purchase transaction screen, but the pawn tran screen will give you a good idea of how that row keeps getting updated as customer makes int.pymts or redeems. I just added more tran types to same screen and made it a generic tran screen.. caveat: the demo's in spanish, but 2 mins. into it, you'll see the pawn screen and get my idea, maybe brush up on some spanish too!
Frank Computer
@Frank Computer - Naming is important. "Merchandise" implies, if not defined as, a physical item; in this case, the item being pawned. That is different than the transaction in the item was transferred/sold to the shop or the state of ownership/control that pawned item. The item that was pawned is an attribute of the transaction. Thus, while I think "Transactions" is a good name for the table in question, "Merchandise" would not. Like I said, if the requested reports and analysis are easy to build then I would say that the schema fits the business requirements and who cares whether it is 1NF.
Thomas
+3  A: 

Your table second design is better, and, is "normalised".

You first design was the denormalised one!

You are basiclly following a database design/modelling pattern known as "subtype/supertype" for dealing with things like transactions where there is a lot of common data and some data specific to each tranasaction type.

There are two accepted ways to model this. If the the variable data is minimal you hold everthing in a single table with the transaction type specfic attributes held in "nullable" columns. (Which is essentially your case and you have done the correct thing!).

The other case is where the "uncommon" data varies wildly depending on transaction type in which case you have a table which holds all the "common" attributes, and a table for each type which holds the "uncommon" attributes for that "type".

However while "LOAN", "PURCHASE" and "SALE" are valid as transactions. I think Inventory is a different entity and should have a table on its own. Essentially "LOAN" wll add to inventory transaction, "PURCHASE" wll change of inventory status to Saleable and "SALE" wll remove the item from inventory (or change status to sold). Once an item is added to inventory only its status should change (its still a widget, violin or whatever).

James Anderson
@James- Since the pawned or purchased merchandise originated from a particular customer, with a unique serial transaction number assigned to it, should pawnshop transfer merchandise into inventory, pawnshop wants to know merchandise's history, previous transaction number, etc. when inventory is sold, it may record and store buyers name, address, etc. or it may be an annonymous "CASH SALE", but each item is micro-tagged with the original transaction number in order to trace it, should it re-appear at pawnshop or reported stolen.
Frank Computer
@James- Pawnshop inventory is defined as all items that pawnshop will stock for sale to public. Merchandise can fall into the pawnshops inventory by the following methods: 1) A pawned item is forfeited by the customer for lack of interest payments or customer did not redeem it. 2) Pawnshop purchases the merchandise from the customer or another entity. Some forfeited or purchased items, like gold, silver, platinum, rhodium or other precious metals may never be stocked in inventory because the pawnshop will sell it directly to a refinery who will pay pawnshop according to purity and weight.
Frank Computer