I've often seen line_item tables for orders or invoices that copy one or more fields from other tables in order to take a snapshot of a customer's product order when it was placed.

In my schema, however, I can generate a view of an order without copying data. Looking up the order/product/price data is a little more expensive, but I save time, space and redundancy on the copy/insert. I understand that the copy/insert is a one-time transaction, whereas the look-up will be required many times. However, I'm only dealing with tens of thousands of records in a given table, and I don't expect performance to be an issue.

So, because a) my schema supports an accurate look-up without a snapshot, and b) I don't have a strong need for look-up optimization, I think it makes sense to run a calculation instead of taking a snapshot. Or is there something I'm missing, and should I always take a snapshot in cases like this?

Here's an example of what the look-up calculation would look like:

# display order items for a particular order on a particular date

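# get order, products and base prices from order_id
# ("order" is a reserved word in SQL, so the table name is quoted here;
#  the quoting syntax depends on the database)

order_products = SELECT * FROM order_has_product ohp
                          INNER JOIN price ON (price.product_id = ohp.product_id)
                          INNER JOIN "order" o ON (o.id = ohp.order_id)
                          WHERE ohp.order_id = ?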


# calculate price of each product at order.datetime_opened

for op in order_products:

    tax = SELECT COALESCE(SUM(tax.rate), 0) FROM product_has_tax pht
             INNER JOIN tax ON (tax.id = pht.tax_id)
             WHERE pht.product_id = op.product_id
                 AND tax.date_start <= op.datetime_opened
                 AND tax.date_end >= op.datetime_opened

    discount_product = SELECT COALESCE(SUM(discount.rate), 0) FROM product_has_discount phd
             INNER JOIN discount ON (discount.id = phd.discount_id)
             WHERE phd.product_id = op.product_id
                 AND discount.date_start <= op.datetime_opened
                 AND discount.date_end >= op.datetime_opened

    discount_customer = SELECT COALESCE(SUM(discount.rate), 0) FROM customer_has_discount chd
             INNER JOIN discount ON (discount.id = chd.discount_id)
             WHERE chd.customer_id = op.customer_id
                 AND discount.date_start <= op.datetime_opened
                 AND discount.date_end >= op.datetime_opened
                 AND (chd.date_used_limited IS NULL OR chd.date_used_limited = op.datetime_opened)

    discount = discount_product + discount_customer

    price = op.price * (1-discount) * (1+tax)
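
To make the arithmetic in the last step concrete, here is a minimal runnable sketch of the formula on its own. The function name `line_price`, the use of `Decimal`, and the rounding mode are assumptions for illustration and are not part of the schema above; a rate of `None` stands in for a `SUM()` that matched no rows.

```python
from decimal import Decimal, ROUND_HALF_UP

def line_price(base_price, tax_rate, product_discount=None, customer_discount=None):
    """Apply the formula price = base * (1 - discount) * (1 + tax).

    A rate of None stands for "no matching rows", which is what SUM()
    returns in SQL when nothing falls inside the date range.
    """
    zero = Decimal("0")
    discount = (product_discount or zero) + (customer_discount or zero)
    tax = tax_rate or zero
    price = base_price * (1 - discount) * (1 + tax)
    # Round to cents; the rounding mode is an assumption, not from the schema.
    return price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# 100.00 base, 13% tax, 10% product discount, no customer discount
example = line_price(Decimal("100.00"), Decimal("0.13"), Decimal("0.10"), None)
print(example)  # 101.70
```

Using `Decimal` rather than floats keeps the money arithmetic exact until the final rounding step.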
+1  A: 

Usually snapshots are taken to make it easy to ensure you have a faithful record of each sale, be it for data warehousing or for handling customer complaints. With snapshots, it's just a matter of keeping a single table safe, backed up and strictly audited.

Doing it your way makes that guarantee a lot harder, mainly because you have to ensure nobody has tampered in any way with any of the tables involved (discount, tax, product, order and so on). For example, how would you tell if somebody changed the discount rate for January 2005? It also makes it harder to evolve your data model: if you later need more than one column for the discount rate, you would have to change the calculations for future orders while keeping the old ones for past orders (or else make every change backwards compatible).

Space is cheap, and having snapshots makes lots of things easier at a small cost.
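
As a rough illustration of the tampering concern, here is a minimal sketch using Python's built-in `sqlite3` module. The tables are deliberately simplified (no date ranges, prices in integer cents, discounts in whole percent), and every table, column, and function name is an assumption made for this example rather than the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product  (id INTEGER PRIMARY KEY, base_cents INTEGER NOT NULL);
    CREATE TABLE discount (product_id INTEGER, rate_percent INTEGER NOT NULL);
    -- Snapshot table: values are copied in once, when the order is placed.
    CREATE TABLE line_item (
        order_id INTEGER, product_id INTEGER,
        base_cents INTEGER, discount_percent INTEGER, charged_cents INTEGER
    );
    INSERT INTO product  VALUES (1, 10000);  -- 100.00
    INSERT INTO discount VALUES (1, 10);     -- 10% off
""")

def live_price(product_id):
    """Recompute (base_cents, discount_percent, price_cents) from the live tables."""
    return conn.execute(
        "SELECT p.base_cents, d.rate_percent, "
        "       p.base_cents * (100 - d.rate_percent) / 100 "
        "FROM product p JOIN discount d ON d.product_id = p.id "
        "WHERE p.id = ?",
        (product_id,),
    ).fetchone()

def place_order(order_id, product_id):
    """Copy the current values into line_item so later edits cannot alter them."""
    base, pct, charged = live_price(product_id)
    conn.execute(
        "INSERT INTO line_item VALUES (?, ?, ?, ?, ?)",
        (order_id, product_id, base, pct, charged),
    )

place_order(order_id=1, product_id=1)

# Later, somebody edits the discount row in place.
conn.execute("UPDATE discount SET rate_percent = 25 WHERE product_id = 1")

recomputed = live_price(1)[2]
snapshotted = conn.execute(
    "SELECT charged_cents FROM line_item WHERE order_id = 1"
).fetchone()[0]
print(snapshotted, recomputed)  # 9000 7500
```

The recomputed price silently follows the edited row, while the snapshot still shows what the customer was actually charged.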

Vinko Vrsalovic
+5  A: 

In the online sales application I worked on, we always copied calculated taxes and prices to each individual order when it was placed; this is your "snapshot" option. We never recalculated it. Why?

  • Prices change; the snapshot captures the price at the time the product was ordered, not what it is now.
  • Tables change. You are relying on every producer and consumer of your prices to know about the view and use only the view. Forever. If you provide the snapshot, future data consumers don't care how it was calculated.
  • Schemas change. Ontario is changing how it calculates sales tax and what products are taxed. This breaks the view, but not a snapshot.
  • Rules change. We had to provide a way for clients to override rules for things like coupons and price-matching. You can do this by overriding the snapshotted price; you can't do it with views.
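
The override point can be sketched roughly as follows, again with Python's built-in `sqlite3`. The `line_item` columns and the helper names are hypothetical, chosen only to show one way an override could sit next to the snapshotted price; this is not a description of the application mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE line_item (
        id INTEGER PRIMARY KEY,
        charged_cents INTEGER NOT NULL,  -- snapshot taken at order time
        override_cents INTEGER,          -- NULL unless manually overridden
        override_reason TEXT
    );
    INSERT INTO line_item (id, charged_cents) VALUES (1, 9000), (2, 4500);
""")

def override_price(line_item_id, new_cents, reason):
    """Record a manual override without destroying the original snapshot."""
    conn.execute(
        "UPDATE line_item SET override_cents = ?, override_reason = ? WHERE id = ?",
        (new_cents, reason, line_item_id),
    )

def effective_prices():
    """Return (id, price) pairs, preferring the override when one exists."""
    return conn.execute(
        "SELECT id, COALESCE(override_cents, charged_cents) "
        "FROM line_item ORDER BY id"
    ).fetchall()

override_price(1, 8500, "price match")
prices = effective_prices()
print(prices)  # [(1, 8500), (2, 4500)]
```

Keeping the override in its own column, rather than overwriting the snapshot, preserves both the originally calculated price and the reason it was changed.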
Dour High Arch
Prices and taxes changing is not a problem because they are separate entities in my schema, with start/end dates to look up their values at a given time. But great point that the snapshots will make the data more transparent for future data consumers. And an even better point: my discount rules are a headache. Using my approach, I would have to save the old rules if they changed! Thanks for pointing that out!
Tony
I tried to edit it to make it clearer that, in my example, it is not the tax amount that is changing but the way taxes are calculated; that will require a different database schema.
Dour High Arch