Hi folks,

I have a table that contains a few columns and then 2 final (nullable) columns which are varbinary (actually, they are SQL 2008 geography types, but I want to keep this post database agnostic).

I've hit around 500 MB with around 200K rows. The varbinary data is the problem, and I need that data.

So, I was wondering if it's bad if I do the following:-

  • Create a separate FILEGROUP: SpatialData.mdf
  • Create a new table, assigned to that new filegroup.
  • Move all the spatial data (read: last two fields) out of the original table and into the new table. The new table has a foreign key against the original table.
  • Create a view representing both tables.

Now, the view will be a left outer join because the relationship is: the new table has a zero or one row relationship to the original table.

E.g.

Original Table

FooId INT PK NOT NULL IDENTITY
Blah VARCHAR(..) NOT NULL
Boo WHATEVER NOT NULL

New Table

FooId PK FK NOT NULL
Spatial_A VARBINARY(MAX)/GEOGRAPHY
Spatial_B VARBINARY(MAX)/GEOGRAPHY
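
The two-table layout and the view described above can be sketched in a database-agnostic way. This is only a minimal illustration, assuming Python's built-in sqlite3 module as a stand-in for SQL Server. The names Foo, FooSpatial and FooWithSpatial are hypothetical, BLOB stands in for VARBINARY(MAX)/GEOGRAPHY, TEXT stands in for WHATEVER, and filegroups are not modelled because SQLite has none.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in (assumption: not SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
    -- Original table, without the two spatial columns.
    CREATE TABLE Foo (
        FooId INTEGER PRIMARY KEY AUTOINCREMENT,
        Blah  VARCHAR(50) NOT NULL,
        Boo   TEXT NOT NULL
    );

    -- New table: same key, so each Foo row has zero or one spatial row.
    CREATE TABLE FooSpatial (
        FooId     INTEGER PRIMARY KEY REFERENCES Foo (FooId),
        Spatial_A BLOB,
        Spatial_B BLOB
    );

    -- View that presents both tables as one; LEFT OUTER JOIN keeps
    -- Foo rows that have no spatial data.
    CREATE VIEW FooWithSpatial AS
        SELECT f.FooId, f.Blah, f.Boo, s.Spatial_A, s.Spatial_B
        FROM Foo AS f
        LEFT OUTER JOIN FooSpatial AS s ON s.FooId = f.FooId;
""")

# Two rows in the original table, only the first has spatial data.
conn.execute("INSERT INTO Foo (Blah, Boo) VALUES ('first', 'x')")
conn.execute("INSERT INTO Foo (Blah, Boo) VALUES ('second', 'y')")
conn.execute(
    "INSERT INTO FooSpatial (FooId, Spatial_A) VALUES (1, ?)", (b"\x01\x02",)
)

rows = conn.execute(
    "SELECT FooId, Blah, Spatial_A FROM FooWithSpatial ORDER BY FooId"
).fetchall()
for row in rows:
    print(row)
```

Rows without a matching FooSpatial entry still come back from the view, with NULL (None in Python) in the spatial columns.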

The reason why I want to know if this is bad is because of the view and how the view is doing a join on the spatial table. I'll be using the view a lot. Currently, I'm just doing queries against the original table (because the new table doesn't exist just yet). By adding this join and the PK/FK relationship, will this impact performance?

Why split the data? I need to download the live DB to our dev servers now and then. We don't really care too much about those two spatial fields, so not having them is fine. Therefore, the size of the database to download will be much smaller.

Thoughts?

+1  A: 

Instead of creating a second table, joining, and creating a view, a better solution that is possible with SQL Server 2005/2008 is to use table partitioning. To my recollection, you can vertically partition a table, and place some columns (i.e. your geospatial columns) in one file group, while putting the rest in another file group. SQL Server will handle the rest for you, you don't need to bother with a join, and you don't need a view.

jrista
Table Partitioning eh ... that sounds UBER KEWL :) I need to check this out...
Pure.Krome
Uber kewl it is. :D And a performance tuning option too, when you need it.
jrista
Jrista ... I can't seem to find any links on how to assign a field (on a table) to a different filegroup. This is what vertical partitioning is, right? Could you update your answer to include some links, please?
Pure.Krome
I think you need to search for "Partition View". It's been some time since I did any table partitioning in SQL Server, but I remember it was fairly difficult to find all the right information, too.
jrista
Oy, Krome, my apologies. I think I've been confusing Replication partitioning with table partitioning. Replication can be set up to partition both horizontally and vertically... however, SQL Server 2005/2008 only allows horizontal partitioning schemes. Vertical partitioning, or what they call row splitting, is still done manually by pulling "slow" columns into a second table, and joining with a custom view. Sorry for the runaround.
jrista
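
The manual row splitting mentioned in the comment above could look roughly like this. It is a sketch under assumptions rather than a SQL Server script: it uses Python's built-in sqlite3 module as a stand-in database, the table names Foo and FooSpatial are hypothetical, and BLOB stands in for the spatial type. Only the copy step is shown; dropping the old columns from the original table is left out because the syntax varies between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.executescript("""
    -- Wide original table that still carries the two large columns.
    CREATE TABLE Foo (
        FooId     INTEGER PRIMARY KEY,
        Blah      TEXT NOT NULL,
        Spatial_A BLOB,
        Spatial_B BLOB
    );
    INSERT INTO Foo VALUES (1, 'has both', x'01', x'02');
    INSERT INTO Foo VALUES (2, 'has none', NULL,  NULL);
    INSERT INTO Foo VALUES (3, 'has one',  x'03', NULL);

    -- Side table that will receive the large columns.
    CREATE TABLE FooSpatial (
        FooId     INTEGER PRIMARY KEY REFERENCES Foo (FooId),
        Spatial_A BLOB,
        Spatial_B BLOB
    );
""")

# Copy only rows that actually have spatial data, inside one transaction
# (the connection context manager commits on success, rolls back on error).
with conn:
    conn.execute("""
        INSERT INTO FooSpatial (FooId, Spatial_A, Spatial_B)
        SELECT FooId, Spatial_A, Spatial_B
        FROM Foo
        WHERE Spatial_A IS NOT NULL OR Spatial_B IS NOT NULL
    """)

moved_ids = [r[0] for r in conn.execute(
    "SELECT FooId FROM FooSpatial ORDER BY FooId"
)]
print(moved_ids)
```

Skipping rows where both columns are NULL is what keeps the zero-or-one relationship from the question: a Foo row with no spatial data simply has no row in the side table.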
+1  A: 

The method that you've described is actually fairly common in my experience. Technically, if you were to normalize your database to the fullest extent, you would have a lot of tables like that, since one of the (rarely applied) steps in normalization is making sure that no columns have NULL values.

In practice it isn't usually carried out to that extent, but for a column (or columns) that is sparsely populated it's not a bad idea to separate it out. As long as the tables share the same primary key (which will of course be indexed), performance shouldn't be a problem.
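
One way to check the claim about the shared primary key is to look at the query plan for the join. The sketch below does this with Python's built-in sqlite3 module, which is an assumption made for portability: it only shows what SQLite's planner does, the table names are hypothetical, and on SQL Server you would need to inspect the actual execution plan instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.executescript("""
    CREATE TABLE Foo (
        FooId INTEGER PRIMARY KEY,
        Blah  TEXT NOT NULL
    );
    CREATE TABLE FooSpatial (
        FooId     INTEGER PRIMARY KEY REFERENCES Foo (FooId),
        Spatial_A BLOB
    );
""")

# Ask SQLite how it would execute the join that the view performs.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT f.FooId, f.Blah, s.Spatial_A
    FROM Foo AS f
    LEFT OUTER JOIN FooSpatial AS s ON s.FooId = f.FooId
""").fetchall()

# The human-readable description is the last column of each plan row.
details = [row[-1] for row in plan]
for detail in details:
    print(detail)

# True when at least one step looks up rows through a primary key
# rather than scanning the whole side table.
uses_pk = any("PRIMARY KEY" in detail for detail in details)
print(uses_pk)
```

If the side table is probed by its primary key for each row of the original table, the join cost stays close to one key lookup per row, which is the reasoning behind "performance shouldn't be a problem".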

Tom H.