In our inventory database (SQL Server 2008 Standard Edition) we have a table (called StockResults) that stores results for each stock item by stock period. It looks like this:

<< StockResults >>
PK StockPeriodID int
PK StockItemID int 
OStockCost money 
OStockQty real 
DeliveriesQty real 
CreditsQty real 
TransfersInQty real 
TransfersOutQty real 
CStockQty real 
OStockAmt money 
DeliveriesAmt money 
CreditsAmt money 
TransfersInAmt money 
TransfersOutAmt money 
CStockAmt money
... except that it has about 40 columns

We are considering normalising that table, so that we have a table for fields and another for data. Like this:

create table StockResults_Fields
(FieldID int, FieldName varchar(20), FieldDataType varchar(10))

create table StockResults_Values
(StockPeriodID int, StockItemID int, FieldID int, FieldValue sql_variant)

The reason we are considering doing that is to improve the performance of the table and to prevent deadlocks (which we are currently getting). The advice on normalizing to reduce deadlocks comes from this article: Reducing SQL Server Deadlocks.

My concern is that the results table (which is already large) will get even bigger. Also, most of the reports display data in a structure similar to the current one, so the new design will need quite a few more joins.
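
For illustration, a rough sketch of what reading one of the current wide rows back out of the proposed two-table design might look like (the CASE/MAX pivot and the casts are assumptions about how it would have to be queried):

-- reconstruct one row of the current StockResults layout from the
-- proposed StockResults_Values / StockResults_Fields pair
select v.StockPeriodID,
       v.StockItemID,
       max(case when f.FieldName = 'OStockQty' then cast(v.FieldValue as real)  end) as OStockQty,
       max(case when f.FieldName = 'OStockAmt' then cast(v.FieldValue as money) end) as OStockAmt
       -- ...and so on for each of the ~40 columns
from StockResults_Values v
join StockResults_Fields f on f.FieldID = v.FieldID
group by v.StockPeriodID, v.StockItemID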

Before we start on something that will involve quite a lot of work, does anyone have any advice on this normalized structure for results and the performance benefits we can expect?

EDIT: Thanks for the advice. I had a gut feeling that the two-table approach wasn't the way to go, but I wasn't sure why -- until now. The locking error has been solved: we had a table with no clustered index. Snapshot isolation looks like something we might consider too.
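
For reference, a minimal sketch of what adding the missing clustered index might look like (the table, key columns, and constraint name here are illustrative, not the actual heap we fixed):

-- turn a heap into a clustered table by clustering on its natural key
alter table StockResults
    add constraint PK_StockResults
    primary key clustered (StockPeriodID, StockItemID)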

+2  A: 

You could try changing the database to the snapshot isolation level first to see if this reduces contention/locking. I did this recently for an MS Dynamics installation (large tables with many columns) and it has worked a treat (yes, I got the OK from Microsoft first!). If it works, it would be a much quicker solution than your table refactor proposal.
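
A minimal sketch of enabling it, assuming the database is called InventoryDB (the name is hypothetical):

-- allow explicit SET TRANSACTION ISOLATION LEVEL SNAPSHOT
alter database InventoryDB set allow_snapshot_isolation on;
-- optionally make ordinary READ COMMITTED use row versioning as well
-- (this one needs the database to have no other open connections)
alter database InventoryDB set read_committed_snapshot on;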

redsquare
Thanks for the advice about the snapshot isolation.
Craig HB
@craig-hb did you resolve your issues? I would be interested in what you did/tried and what worked.
redsquare
ah I see it now in your edit.
redsquare
A: 

You might be able to get rid of the deadlock problem here, but for performance, have you tried partitioning the table across separate spindles? You can also enable snapshot isolation as redsquare says, or add WITH (NOLOCK) to your SELECT queries if you don't want all queries to use the same isolation level.
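
For illustration, the hint goes on each table reference, e.g. (column list and filter value are just examples):

-- dirty read: takes no shared locks, but may return uncommitted data
select StockPeriodID, StockItemID, CStockQty, CStockAmt
from StockResults with (nolock)
where StockPeriodID = 42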

Vidar Nordnes
+7  A: 

It sounds like you know all the needed columns at the time of designing the system. If that's the case, you should absolutely not proceed with the design you proposed.

The only possible reason for that kind of design is if you don't know all the fields you will need at design time, and need to add some after you are in production.

I would predict that your two-table approach would perform much worse than your current approach.

Also, this has nothing to do with normalization, at least by my definition. What you would be doing is moving away from a relational model and toward a metadata model.

(Edit: you should also post more info about when/where the deadlocks are occurring, if that is the root of the problem you are trying to solve).

Phil Sandler
+1. We do a lot of development using tables storing name/value pairs. I say this simply to reiterate that if the reporting structure already uses all of those fields and you already know the columns at design time, then do NOT go down this path. It has its own performance concerns and generally requires some serious DBA experience to pull off correctly.
Chris Lively
+1  A: 

I would guess that one of the reasons you are getting deadlocks is that you are trying to do too much in that one table. You do need normalized child tables, just not what you suggested (which is NOT normalizing the data by any stretch of the imagination; please read http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/ for reasons why your suggestion is a bad idea).

It seems to me that your table structure is constantly being updated because it summarizes data. Put the changes to the data into separate rows, and then summarize in reports or in a data warehouse that is only updated periodically. Or even do the summary in a view, but make the data changes against the base tables, not through the view.

So I would have a stock table to define the part number and descriptive details of the stock, and then one child table for each type of data that could change over time.

So a stock_transaction table would start with the stock ID and the number of items. If you receive a new piece of stock, it adds a record with the total number received. Then if you issue an item, it adds a record with -1 in the number of items, and so on. To find the total number of items you have, you sum the data. To find the total number of stock items issued, you sum the negative values and take the absolute value. This is how most warehouse stocking applications I've worked with work. You would also have additional fields to track who the stock went to and the date of the transaction, so you can see what is happening at a detail level.
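
A rough sketch of what that might look like (the table and column names are my own guesses, not from the original system):

-- one row per stock movement; Quantity is positive for receipts, negative for issues
create table stock_transaction
(StockItemID     int not null,
 TransactionDate datetime not null,
 Quantity        real not null,
 IssuedTo        varchar(50) null)

-- receive 10 items, then issue 1
insert into stock_transaction (StockItemID, TransactionDate, Quantity)
values (1, getdate(), 10)
insert into stock_transaction (StockItemID, TransactionDate, Quantity, IssuedTo)
values (1, getdate(), -1, 'Ward 3')

-- current stock on hand
select StockItemID, sum(Quantity) as QtyOnHand
from stock_transaction
group by StockItemID

-- total quantity issued, reported as a positive number
select StockItemID, abs(sum(Quantity)) as QtyIssued
from stock_transaction
where Quantity < 0
group by StockItemID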

Without having a better picture of your whole table right now, what the fields mean, and how they are populated, I can't suggest which child tables might need to be created. But you should have fewer deadlocks and a better definition of what is happening in your system if you separate out the data into its natural child records. Just think: if a field needs to be updated frequently, it is a candidate for a child table. You can probably group several of your current fields into one child table; it depends on how closely the information is related.

However, restructuring like this is a major effort and will take much time to do, to migrate and test the data, and to make the application changes. If you can get better performance right now by partitioning or by using snapshot isolation as others have suggested, that would be the way to go. If you get better performance but it isn't good enough, then redesign, but use the other techniques to improve performance now while you spend the solid year or so it will take to redesign this. I only make this suggestion in case you can't fix the performance problem any other way, to show you what really normalizing the data would look like and to give you an idea of how expensive following this path will be.

HLGEM
You are right. What the OP is doing cannot be considered normalization.
Walter Mitty
Great article that you linked to! Poor old Randy and his database to end all databases.
Craig HB