views: 420
answers: 4

Is there any problem with making all your SQL Server 2008 string columns varchar(max)? My allowable string sizes are managed by the application; the database should just persist what I give it. Will I take a performance hit by declaring all string columns to be of type varchar(max) in SQL Server 2008, no matter what size of data actually goes into them?
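
For illustration, here's roughly what I mean (table and column names are made up; the real limits live in the application):

    -- every string column declared varchar(max); the application enforces lengths
    CREATE TABLE Customer
    (
        CustomerId int IDENTITY(1,1) PRIMARY KEY,
        FirstName  varchar(max) NOT NULL,   -- app limits this to 50 characters
        LastName   varchar(max) NOT NULL,   -- app limits this to 50 characters
        Notes      varchar(max) NULL
    );

    -- versus the conventional approach of declaring the limits in the schema
    CREATE TABLE Customer_Sized
    (
        CustomerId int IDENTITY(1,1) PRIMARY KEY,
        FirstName  varchar(50)   NOT NULL,
        LastName   varchar(50)   NOT NULL,
        Notes      varchar(2000) NULL
    );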

+1  A: 

Simon Sabin wrote a post on this some time back. I don't have time to grab it now, but you should search for it, because he comes to the conclusion that you shouldn't use varchar(max) by default.

Edit: Simon has a few posts about varchar(max); the links in the comments below show this quite nicely. I think the most significant one is http://sqlblogcasts.com/blogs/simons/archive/2009/07/11/String-concatenation-with-max-types-stops-plan-caching.aspx, which talks about the effect of varchar(max) on plan caching. The general principle is to be careful: if you don't need it to be max, then don't use max. If you need more than 8000 characters, then sure, go for it.
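
To make that 8000-character boundary concrete (a hypothetical table, not from Simon's posts):

    -- varchar(n) tops out at n = 8000; anything longer has to be varchar(max)
    CREATE TABLE Article
    (
        ArticleId int IDENTITY(1,1) PRIMARY KEY,
        Title     varchar(200)  NOT NULL,   -- known, modest limit: declare it
        Teaser    varchar(8000) NOT NULL,   -- the largest non-max varchar
        Body      varchar(max)  NOT NULL    -- genuinely unbounded text: max earns its keep
    );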

Rob Farley
This? http://sqlblogcasts.com/blogs/simons/archive/2005/11/18/when-is-varchar-max--not-a-varchar-max-.aspx
OMG Ponies
nope, this one: http://sqlblogcasts.com/blogs/simons/archive/2006/02/28/Why-use-anything-but-varchar_2800_max_2900_.aspx
gbn
The first one. By OMG Ponies.
Rob Farley
Sorry I didn't have time to find the actual link - I was about to step into a meeting but wanted to fire off the answer first.
Rob Farley
And more significantly, this one: http://sqlblogcasts.com/blogs/simons/archive/2009/07/11/String-concatenation-with-max-types-stops-plan-caching.aspx
Rob Farley
This sounds like a bug that occurs in an odd situation, and the comments on the article make it sound like it's already been fixed or is going to be fixed. Is this the most serious downside to using varchar(max) everywhere? Because if so, varchar(max) everywhere is sounding pretty nice.
BowserKingKoopa
Well, it can be 'nice' for the situation of "We might suddenly need to put an essay in for someone's surname", but when you're wondering how SQL is going to handle that in various situations (particularly calculations and indexes), then you find yourself feeling like the 'nice' is just a trade-off.
Rob Farley
+1  A: 

Yes, you will take a performance hit.

If a column is char(20), then to get to the next column the database only has to do ptr += 20. If it's varchar, it has to store the length of the string and read it back for every record, which is extra work per row.

It can also cause problems with data fragmentation.
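
You can see the storage difference for yourself (a quick sketch; the values are arbitrary):

    -- char pads to its declared width; varchar stores only the characters inserted
    -- (plus a small length prefix), which is why column offsets have to be computed
    DECLARE @t TABLE (FixedName char(20), VarName varchar(20));
    INSERT INTO @t VALUES ('Bowser', 'Bowser');

    SELECT DATALENGTH(FixedName) AS FixedBytes,   -- 20: always the full declared width
           DATALENGTH(VarName)   AS VarBytes      -- 6: just what was actually stored
    FROM @t;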

BlueRaja - Danny Pflughoeft
This is an argument for using `char` over `varchar`, though, not `varchar(50)` over `varchar(MAX)`.
Adam Robinson
He didn't ask *(originally)* whether `varchar(50)` would be better than `varchar(max)`; he asked whether `varchar(max)` would incur a performance penalty. Yes, it will, especially when compared with `char(50)`.
BlueRaja - Danny Pflughoeft
A: 

Ideally, you should only allocate what you need. If you're certain a particular column (say, a username column) is never going to be more than 20 characters long, using VARCHAR(20) rather than VARCHAR(MAX) lets the database optimize queries and data structures.
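
For example (a hypothetical username column, assuming that 20-character rule):

    CREATE TABLE Account
    (
        AccountId int IDENTITY(1,1) PRIMARY KEY,
        UserName  varchar(20) NOT NULL   -- sized to the known business limit
    );

    -- The declared size also acts as a safety net: this fails with a
    -- "String or binary data would be truncated" error rather than
    -- silently accepting an over-long value.
    INSERT INTO Account (UserName) VALUES (REPLICATE('x', 25));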

From MSDN: http://msdn.microsoft.com/en-us/library/ms176089.aspx

Variable-length, non-Unicode character data. n can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes.

Are you really ever going to come close to 2^31-1 bytes in these columns?

pygorex1
+2  A: 

For one thing, index keys cannot be more than 900 bytes wide, so you can probably never index a varchar(max) column (see the sketch at the end of this answer). If your data is always less than 900 bytes, use varchar(900).

This is one downside, because it gives:

  • really bad search performance
  • no unique constraints
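
A quick way to see the index limits (a hypothetical table; the exact messages vary a little between versions):

    CREATE TABLE Demo
    (
        SizedCol varchar(900),
        WideCol  varchar(1000),
        MaxCol   varchar(max)
    );

    -- fine: 900 bytes is exactly the index key limit
    CREATE INDEX IX_Demo_Sized ON Demo (SizedCol);

    -- created, but with a warning that inserts over 900 bytes will fail
    CREATE INDEX IX_Demo_Wide ON Demo (WideCol);

    -- fails outright: a varchar(max) column isn't valid as an index key column
    CREATE INDEX IX_Demo_Max ON Demo (MaxCol);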
gbn
But what if the varchar(max) column doesn't have any values that are greater than 900 bytes? Would it index then? I'm confused because a lot of what I'm reading makes varchar column types sound like they auto-size themselves up to their max as data is entered. That would be perfect for what I want, because it's the application that should be deciding the max, not the database.
BowserKingKoopa
On a varchar(max) column you'd get an error outright, because it can't be used as an index key at all. On something like varchar(1000) you'd get a warning when you create the index and an error when you try to insert a key longer than 900 bytes. But if your data is always < 900, why not use 900? Yes, they are stored as variable-length strings either way.
gbn
I don't know if my data is always < 900; that's a business-logic concern. If that rule changes, I should change it in the business logic, not have to also change the database. That's my goal, anyway: to see if I can get away with taking the concern about string sizes out of the database without a noticeable performance hit.
BowserKingKoopa
