views: 39
answers: 3

Suppose I have the following table schema:

ID | Name | Width | Height | Size

What considerations should I think about when breaking this table into two tables with a one-to-one relationship?

E.g.:

ID | Name
and
ID | Width | Height | Size

My concern is that the table will end up with a lot of columns (the five here are for illustration only, and there is a high likelihood of new columns being added in the future). I'm worried that a larger row size will have a negative impact on performance and/or on the clarity of the design. Is this true? And how does that compare with the performance hit of joining the two tables?
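
For concreteness, here is a minimal sketch of what the split might look like in T-SQL. The table and column types (Product, ProductSpec, the decimal widths) are assumptions for illustration only; the shared primary key is what enforces the one-to-one relationship.

    CREATE TABLE Product (
        ID   int           NOT NULL PRIMARY KEY,
        Name nvarchar(100) NOT NULL
    );

    CREATE TABLE ProductSpec (
        ID     int          NOT NULL PRIMARY KEY
                            REFERENCES Product (ID),  -- shared key = 1:1 link
        Width  decimal(9,2) NULL,
        Height decimal(9,2) NULL,
        Size   decimal(9,2) NULL
    );

    -- Reading a full row back now requires a join:
    SELECT p.ID, p.Name, s.Width, s.Height, s.Size
    FROM   Product AS p
    LEFT JOIN ProductSpec AS s ON s.ID = p.ID;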

A: 

You should probably keep related information together (like the name of the product and its specs), mainly because the information belongs together.

If you separate them, the only real 'boost' you get is when you only need to access the first table, because the second table requires information from the first to be useful (you need the name to identify the product).

I'd probably stick with the original design unless you wanted to make a table for sizes, a table for products, and a relationship between the two.

Chacha102
I may not always need the information in the second table (or third, or fourth), and when I do need it, I can lazy-load it.
Adrian Godong
My question is: are there specific sets of Width/Height/Size/etc. values? If most products only come in X variations, it would probably be better to make a table of variations and then left-join that against the meta info.
Chacha102
Those were just for illustration, and no, each product will only have one variation, and it is almost always unique.
Adrian Godong
+1  A: 

As per BOL:

Surpassing the 8,060-byte row-size limit might affect performance because SQL Server 2005 Database Engine still maintains a limit of 8 KB per page. When a combination of varchar, nvarchar, varbinary, sql_variant, or CLR user-defined type columns exceeds this limit, the Database Engine moves the record column with the largest width to another page in the ROW_OVERFLOW_DATA allocation unit, while maintaining a 24-byte pointer on the original page. Moving large records to another page occurs dynamically as records are lengthened based on update operations. Update operations that shorten records may cause records to be moved back to the original page in the IN_ROW_DATA allocation unit. Also, querying and performing other select operations, such as sorts or joins on large records that contain row-overflow data slows processing time, because these records are processed synchronously instead of asynchronously.
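
If you want to check whether a particular table is actually pushing rows into ROW_OVERFLOW_DATA, rather than guessing from the declared column sizes, one way (sketched here with a placeholder table name) is to look at the allocation units reported by sys.dm_db_index_physical_stats in DETAILED mode:

    -- dbo.Product is a placeholder; substitute the table you are worried about.
    SELECT  alloc_unit_type_desc,        -- IN_ROW_DATA vs ROW_OVERFLOW_DATA
            page_count,
            record_count,
            max_record_size_in_bytes
    FROM    sys.dm_db_index_physical_stats(
                DB_ID(), OBJECT_ID(N'dbo.Product'), NULL, NULL, 'DETAILED');

A nonzero page_count on the ROW_OVERFLOW_DATA rows is the sign that the 8,060-byte limit is being exceeded in practice.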

Raj
Have you tried benchmarking the difference between this and the overhead of joins?
Adrian Godong
An honest reply would be no. But I have encountered this scenario: I found a table with row data in excess of 8,060 bytes that was performing poorly, and vertical partitioning gave a good improvement. Vertical partitioning lets queries scan less data, which increases query performance. A table that contains 7 columns, of which only 4 are generally referenced, may benefit from splitting the last 3 columns into a separate table (a sketch follows below).
Raj
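
To illustrate that last point (the table and column names here are invented for the example), vertical partitioning usually means moving the wide, rarely referenced columns into a companion table keyed on the same ID, optionally with a view on top so existing queries don't have to change:

    -- Hot table: the 4 columns most queries touch.
    CREATE TABLE OrderHot (
        OrderID    int     NOT NULL PRIMARY KEY,
        CustomerID int     NOT NULL,
        OrderDate  date    NOT NULL,
        Status     tinyint NOT NULL
    );

    -- Cold table: the 3 wide, rarely read columns, same key.
    CREATE TABLE OrderCold (
        OrderID   int           NOT NULL PRIMARY KEY
                                REFERENCES OrderHot (OrderID),
        Notes     nvarchar(max) NULL,
        RawDoc    xml           NULL,
        LegacyRef varchar(200)  NULL
    );
    GO

    -- Optional view so callers that need everything still see one "table".
    CREATE VIEW OrderFull AS
    SELECT h.OrderID, h.CustomerID, h.OrderDate, h.Status,
           c.Notes, c.RawDoc, c.LegacyRef
    FROM   OrderHot AS h
    LEFT JOIN OrderCold AS c ON c.OrderID = h.OrderID;

Queries that only touch the hot columns now scan narrower rows (more rows per 8 KB page), while queries that need everything pay for the join through the view.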
A: 

First off, your approach is wrong. Design your database the way you think it should be designed, using appropriate normalization and so on. Only when you have an actual performance problem should you start worrying about this kind of thing.

The number of columns really doesn't matter beyond row-size limits and the like. If you get to hundreds (or thousands) of columns, you might reach the point where you need to split the table, but don't worry about it until it actually happens.

cletus
I'm trying to denormalize because I can foresee a problem emerging and would like to intercept it ASAP, not after we hit a performance problem.
Adrian Godong