Which way is better to store sales information in a database:
- Store cost for an item and sales price per item
- Store cost * qty and sales price * qty - so you are storing the totals in the db
Which way is better to store sales information in a database:
I would say you always store cost and price as seperate columns AND store the total. There will be a time when you need those broken out (reporting, invoicing) that you will regret.
The question will really become should you store the calculated values as well. that will really depend on the circumstances of the application (how many rows, how many transactions, etc.). If its a small - medium application, it probably will not matter for performance so it's just a matter of other factors.
If the formula for the total may become more complicated (ex. taxes, discounts, etc..), you may want to store the total. otherwise, having to replicate that logic in other areas may prove to be extremely difficult.
I personally always store the total amount as well. I just find the cost of the extra database column is always outweighed by having that total in other places.
I'd go with storing the cost for an item and the sales price per item. If you're only storing the totals, you'll have a hard time recalculating your values if the prices change. Imagine giving a seasonal discount on some of your items, applying different taxes to some items (maybe even different ones depending on the customer's country of origin) or granting lower prices for registered customers - you'll have to calculate those prices based on the sales price of an item, the totals are of no use in this case.
And if you really need the totals, you can easily calculate them any time using simple DB-queries, taking into account all discounts, etc. etc.
IMHO store cost and sales per item. If you want the results to come like totals, either perform them inside the Query or create a temporally table or a stored procedure with the calculations. However, I usually perform those calculations in the code (PHP or JAVA in my case). If it is only for report purposes, sometimes is fine to store the totals as speed-up the calculations (and prevents some miss-calculations in your code).
This is the 3rd Normal Form problem.
Option 1 is in 3rd Normal Form. There is no derived data. Calculation must be done for every query. Because of this, updates can be done on any field without breaking anything.
Option 2 breaks 3rd Normal Form. It saves derived data. Calculations are not done at query time, making them much faster. However, an update that doesn't also redo the calculation will lead to inconsistent data. This is called an "update anomaly". An update cause the derived data field to be inconsistent.
Further -- depending on the calculation -- it may be impossible to reason out what fields should change when the derived data needs to be changed.
Store all single details for each item, as you will usually need them later (for statistical/information purposes or any other reasons). You have to make sure that you always follow the same calculation logic to get the total sales price. So you maybe have to store currencies, discount, price unit, etc. along with these. Here some more points to consider:
If you want to output the single item price later, you need the single item data.
Don't save total price along with single item data, as you will have to always hold both in sync. You'll get puzzled some month/years later, about which of both to use for your calculations.
Aggregate your data if you are using it in a data warehouse and don't need the details.
I would only store the totals if I have a serious reporting problem where many queries require the total value of a large set of records (if you are only getting the total for a small set, calcuating it onthe fly when requested is OK).
If you store the totals, you need triggers to ensure that updates, deletes and inserts all correctly update the totals. Otherwise the data will be out of synch and useless.
It depends on what you are going to do with the data.
For a great many purposes, it's worthwhile to store, under each item, the unit price of that item, the quantity purchased, and the extended price. The extended price is quantity times unit price. Is this redundant? well, yes. Does it violate some normal form? Well, yes. But it works quite well.
Why store the price in the item row (record) and not just rely on the same price data stored in the master product table? Because if you change the price after this sale, you don't want to change the price that this customer incurred on this purchase. Why store the extended price, since this can be recomputed on the fly? Because it makes it one step simpler to aggregate (sum or average) over some set of items later.
If you store the extended price, you can use those point and click, drill down analysis tools (see OLAP) for later analysis, with no further programming. If you count on recomputing the extended price when you need it, you may find that the drill down tool isn't smart enough to do the multiplying for you.
"items" will normally be children of some larger unit of work. In an invoicing system, an item is part of an invoice. In a general ledger system, an item is part of a transaction. The items aren't balanced, but the transaction is. In a lot of commerce systems these days, the invoicing function and the accounting function are driven off of the same database, but that goes beyond your question.
Normally, I try to keep my data normalized. This is one place where many experts intentionally deviate from the requirements of pure normalization.