views:

139

answers:

3

I need to design a table for daily sales of pharmaceutical products.

There are hundreds of types of products available {Name, code}.

Thousands of sales-persons are employed to sell those products{name, code}.

They collect products from different depots{name, code}.

They work in different Areas -> Zones -> Markets -> Outlets, etc. {All have names and codes}

Each product has various types of prices {Production Price, Trade Price, Business Price, Discount Price, etc.}. And, sales-persons are free to choose from those combination to estimate the sales price.

The problem is, daily sales requires huge amount of data-entry. Within couple of years there may be gigabytes of data (if not terabytes). If I need to show daily, weekly, monthly, quarterly and yearly sales reports there will be various types of sql queries I shall need.

This is my initial design:

Product {ID, Code, Name, IsActive}
ProductXYZPriceHistory {ID, ProductID, Date, EffectDate, Price, IsCurrent}
SalesPerson {ID, Code, Name, JoinDate, and so on..., IsActive}
SalesPersonSalesAraeaHistory {ID, SalesPersonID, SalesAreaID, IsCurrent}
Depot {ID, Code, Name, IsActive}
Outlet {ID, Code, Name, AreaID, IsActive}
AreaHierarchy {ID, Code, Name, PrentID, AreaLevel, IsActive}
DailySales {ID, ProductID, SalesPersonID, OutletID, Date, PriceID, SalesPrice, Discount, etc...}

Now, apart from indexing, how can I normalize my DailySales table to have a fine grained design that I shall not need to change for years to come?

Please show me a sample design of only the DailySales data-entry table (from which all types of reports would be queried) on the basis of above information.

I don't need a detailed design advice. I just need an advice regarding only the DailySales table. Is there any way to break this particular table to achieve granularity?

+1  A: 

Why not put a date and price on the product, so that you can drop the price from the dailysales table, since you can get it by joining.

Unless the price can be changed by the salesman, without any rationale in the database.

On a given date, can a salesman only be in one outlet? If so then you can drop the outletid.

You have PriceiD, SalesPrice and Discount. If I know the discount and outlet id and original price then I can determine taxes and so calculate the SalesPrice, so you can possibly drop that.

But, this would imply that you are storing tax info by date, to track what it was when the sale took place.

My point is that you should look at what already exists in another table and you can then simplify the daily sales table.

You will want to pull information from it into staging tables by day/month/year, for example, to help aggregate the data by date, so that your reports will be generated more quickly.

You have a lot of unknowns in your question, but hopefully this will help.

Update: Based on comment

I had a table that contained usage info on when resources were used by someone, and that table got large quickly. So, we decided that we would only keep 2 or 3 years of data, and the rest would be aggregated, and the raw data dumped to a file for archival purposes.

When looking at the number of rows, you will need to decide how much data you need to keep, and how you can archive the old data, to make it available if it is absolutely needed, but, you can generate the reports that should be needed beforehand.

By reducing the number of columns you will have a large impact on storage space, in case that is a concern, as many of the columns will probably not be null.

James Black
I am actually not concerned about the number of columns. I am concerned about the number of rows the table will contain after years of sales.
JMSA
And I think I have already said that I am only concerned about DailySales table.
JMSA
Do you need to keep the individual rows after 5 yrs, or can you start to aggregate the older ones and leave it like that, so you can delete them from the table, as they are basically archived?
James Black
@JMSA - I was trying to show how to reduce columns in the DailySales table, by looking at what is already available in other tables.
James Black
+1  A: 

If you are going to generate massive amounts of data and need to generate reports based on past data, you should consider using a business intelligence engine. Typically these engines allow you to archive historical data in a separate data store (so that you don't clutter the daily work database), yet to obtain statistical data and reports out of the archived data.

Konamiman
+1  A: 

What you are looking for is called data warehouse (DW). I suggest you take a look at "The Data Warehouse Toolkit" by Ralph Kimball -- it has examples of data warehouse designs for retail sales. Here is a very simplified (first draft) example of how it may look like. You will notice that this is a de-normalized structure optimized for reporting and analytic. The grain of the fact table is usually one item (line) on a receipt. Hope this points you towards your solution. Few terabytes for a DW is ok.


alt text

Damir Sudarevic