Hello everyone,

I have variable-length character data that I want to store in a SQL Server (2005) database. I would like to learn some best practices for choosing between the TEXT and VARCHAR SQL types, and their pros and cons in terms of performance, footprint, and functionality.

Thanks in advance, George

+33  A: 

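TEXT is used for large pieces of string data. By default TEXT values are stored out of row; if the text in row table option is enabled, only values whose length exceeds the configured threshold are stored out of row.

VARCHAR(n) is normally stored in row (SQL Server 2005 moves it to row-overflow pages only when the whole row would exceed 8060 bytes) and has a limit of 8000 characters. If you try to create a VARCHAR(x), where x > 8000, you get an error: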

Server: Msg 131, Level 15, State 3, Line 1
The size () given to the type ‘varchar’ exceeds the maximum allowed for any data type (8000) 

These length limitations do not concern VARCHAR(MAX) in SQL SERVER 2005, which may be stored out of row, just like TEXT.
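A minimal sketch of that difference, using a hypothetical variable name. The commented-out declaration is the case the error message above describes; the VARCHAR(MAX) variable accepts a value longer than 8000 characters.

```sql
-- Rejected per the error above: 8001 exceeds the VARCHAR(n) limit of 8000.
-- DECLARE @too_long VARCHAR(8001);

-- VARCHAR(MAX) has no such limit.
DECLARE @big VARCHAR(MAX);

-- Cast the input to VARCHAR(MAX) first; otherwise REPLICATE
-- truncates its result at 8000 bytes.
SET @big = REPLICATE(CAST('x' AS VARCHAR(MAX)), 10000);

SELECT DATALENGTH(@big) AS bytes_stored;  -- 10000
```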

Note that MAX is not a kind of constant here, VARCHAR and VARCHAR(MAX) are very different types, the latter being very close to TEXT.

In prior versions of MS SQL SERVER you could not access the TEXT directly; you could only get a TEXTPTR and use it with the READTEXT and WRITETEXT statements.

In MS SQL SERVER 2005 you can directly access TEXT columns (though you still need an explicit cast to VARCHAR to assign a non-string value, such as an integer, to them).
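As a rough illustration of the cast mentioned above (the table `dbo.TextDemo` and its columns are made up for this sketch), an integer has to go through VARCHAR before it can be assigned to a TEXT column, while the column itself can be read with ordinary string functions:

```sql
-- Hypothetical demo table.
CREATE TABLE dbo.TextDemo (
    id   INT  NOT NULL,
    body TEXT NULL
);

INSERT INTO dbo.TextDemo (id, body) VALUES (42, 'initial value');

-- Not allowed: there is no conversion from INT to TEXT.
-- UPDATE dbo.TextDemo SET body = id;

-- Works: convert to VARCHAR first, which converts implicitly to TEXT.
UPDATE dbo.TextDemo SET body = CAST(id AS VARCHAR(12));

-- Direct access: SUBSTRING accepts a TEXT argument.
SELECT SUBSTRING(body, 1, 2) AS first_two_chars FROM dbo.TextDemo;
```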

TEXT is good:

  • If you need to store large texts in your database
  • If you do not search on the value of the column
  • If you select this column rarely and do not join on it.

VARCHAR is good:

  • If you store short strings
  • If you search on the string value
  • If you always select it or use it in joins.

By selecting here I mean issuing any queries that return the value of the column.

By searching here I mean issuing any queries whose result depends on the value of the TEXT or VARCHAR column. This includes using it in any JOIN or WHERE condition.

As the TEXT is stored out of row, the queries not involving the TEXT column are usually faster.
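To make the distinction between selecting and searching concrete, here is a small sketch against a hypothetical `dbo.Comments` table. It also shows one practical limit of TEXT: the column cannot be compared with `=`, so searches on it are restricted to operators such as LIKE.

```sql
-- Hypothetical table with a TEXT column.
CREATE TABLE dbo.Comments (
    id   INT  NOT NULL PRIMARY KEY,
    body TEXT NULL
);

-- Selecting: the query returns the value of the column.
SELECT body FROM dbo.Comments WHERE id = 1;

-- Searching: the result depends on the value of the column.
SELECT id FROM dbo.Comments WHERE body LIKE '%index%';

-- Not allowed on TEXT: the equality operator is rejected.
-- SELECT id FROM dbo.Comments WHERE body = 'exact match';

-- A query that does not involve the TEXT column at all.
SELECT COUNT(*) AS comment_count FROM dbo.Comments;
```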

Some examples of what TEXT is good for:

  • Blog comments
  • Wiki pages
  • Source code

Some examples of what VARCHAR is good for:

  • Usernames
  • Page titles
  • Filenames

As a rule of thumb, if you ever need your text value to exceed 200 characters AND do not join on this column, use TEXT.

Otherwise use VARCHAR.

P. S. The same applies to UNICODE enabled NTEXT and NVARCHAR as well, which you should use for examples above.

P. P. S. The same applies to VARCHAR(MAX) and NVARCHAR(MAX) that SQL Server 2005+ uses instead of TEXT and NTEXT. You'll need to enable large value types out of row for them with sp_tableoption if you want them to be always stored out of row.
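A short sketch of the sp_tableoption call mentioned above, assuming a hypothetical table `dbo.Articles`. The last query only reads the setting back from the catalog so you can confirm it took effect.

```sql
-- Hypothetical table with a large value type column.
CREATE TABLE dbo.Articles (
    id   INT          NOT NULL PRIMARY KEY,
    body VARCHAR(MAX) NULL
);

-- Store VARCHAR(MAX) / NVARCHAR(MAX) values out of row for this table.
EXEC sp_tableoption N'dbo.Articles', 'large value types out of row', 1;

-- Check the current setting (1 = out of row, 0 = in row when the value fits).
SELECT name, large_value_types_out_of_row
FROM sys.tables
WHERE name = N'Articles';
```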

As mentioned above and here, TEXT is going to be deprecated in future releases:

The text in row option will be removed in a future version of SQL Server. Avoid using this option in new development work, and plan to modify applications that currently use text in row. We recommend that you store large data by using the varchar(max), nvarchar(max), or varbinary(max) data types. To control in-row and out-of-row behavior of these data types, use the large value types out of row option.

Quassnoi
1. "If you do not search on the value of the column " -- could you show me what you mean by "search"? Do you mean selecting this column, ordering by it, using LIKE on it, or applying a string manipulation function to it?
George2
2. "VARCHAR is always stored in row and has a limit of 8000 characters." -- sorry I do not agree with you. VARCHAR could be longer than 8000 and if longer than 8000, VARCHAR will be stored other than in columns. Any comments?
George2
3. Mladen Prajdic mentioned in this thread that the TEXT type is deprecated, but I cannot find any documentation covering this. Do you have any?
George2
See updated post
Quassnoi
Cool Quassnoi! You are so knowledgeable! :-) One more question -- "This of course does not concern VARCHAR(MAX), which is as for SQL SERVER 2005 a synonym for TEXT." What do you mean by "This"?
George2
"This of course does not concern VARCHAR(MAX), which is as for SQL SERVER 2005 a synonym for TEXT." -- do you have any documents which say TEXT is the same as VARCHAR in SQL Server 2005? I did some searching but cannot find official documents. :-)
George2
BTW: I do not think TEXT and VARCHAR are exactly the same, since TEXT could be longer than 8000 characters. Any comments?
George2
VARCHAR and VARCHAR(MAX) are very different types. TEXT and VARCHAR(MAX) are the same in terms of how they are stored, but different in terms of how you can access them.
Quassnoi
Of course TEXT and VARCHAR(8000) are not the same. You cannot create a VARCHAR(8001), for instance, you'll need VARCHAR(MAX).
Quassnoi
Sorry, I am confused. I always thought there was only one data type in SQL Server called VARCHAR, so VARCHAR and VARCHAR(MAX) would be the same. Am I wrong? :-( Could you provide some links or documents about this topic please?
George2
"but different in terms of how you can access them" -- what do you mean access? Could you show me an example please? :-)
George2
VARCHAR and VARCHAR(MAX) are different types IN TERMS OF HOW THEY ARE STORED. As for TEXT values, you cannot do UPDATE table SET text_column = integer_column, you'll need to do UPDATE table SET text_column = CAST(integer_column AS VARCHAR)
Quassnoi
since SQL server 2005 text has been "replaced" with varchar(max) or nvarchar(max), so the answer is based on a fallacy. Sorry.
Nikos Steiakakis
+15  A: 

If you're using SQL Server 2005, use varchar(MAX). The Text datatype is deprecated and should not be used for new development work.
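For an existing TEXT column, a migration could look roughly like the sketch below. The table `dbo.LegacyNotes` is hypothetical, and on a large table you would want to test the conversion first, since it touches every row.

```sql
-- Hypothetical legacy table that still uses TEXT.
CREATE TABLE dbo.LegacyNotes (
    id   INT  NOT NULL PRIMARY KEY,
    note TEXT NULL
);

-- Convert the column in place to the recommended type.
ALTER TABLE dbo.LegacyNotes ALTER COLUMN note VARCHAR(MAX) NULL;

-- After the change, functions such as LEN work on the column directly.
SELECT id, LEN(note) AS note_length FROM dbo.LegacyNotes;
```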

Mladen Prajdic
Thanks Mladen, I am surprised to see TEXT is deprecated. Do you have any official documents mentioning this?
George2
achinda99
this is as official as it gets :) http://msdn.microsoft.com/en-us/library/ms187993.aspx
Mladen Prajdic
Cool achinda99 and Mladen Prajdic! What you provided is what I am looking for. :-) One more question, how do we choose whether to use VARCHAR or VARCHAR(MAX) in different situations?
George2
read this: http://www.sqljunkies.com/WebLog/simons/archive/2006/02/28/Why_use_anything_but_varchar_max.aspx
Mladen Prajdic
+9  A: 

In SQL Server 2005 new datatypes were introduced: varchar(max) and nvarchar(max). They have the advantages of the old text type: they can contain up to 2GB of data, but they also have most of the advantages of varchar and nvarchar. Among these advantages is the ability to use string manipulation functions such as substring().

Also, varchar(max) is stored in the table's (disk/memory) space while the size is below 8 KB. Only when you place more data in the field is it stored out of the table's space. Data stored in the table's space is (usually) retrieved quicker.

In short, never use Text, as there is a better alternative: (n)varchar(max). And only use varchar(max) when a regular varchar is not big enough, i.e. if you expect the string that you're going to store to exceed 8000 characters.

As was noted, you can use SUBSTRING on the TEXT datatype, but only as long as the TEXT field contains fewer than 8000 characters.

edosoft
Thanks Edoode, you explained quite thoroughly how good VARCHAR is, but do you have any comments or ideas about when to use VARCHAR and when to use TEXT? My question is about choosing one of the two. :-)
George2
Actually, in MS SQL Server 2005 you can use SUBSTRING and other functions on TEXT columns too.
Quassnoi
Thanks Quassnoi! Looks like TEXT is deprecated. One more question, how do we choose whether to use VARCHAR or VARCHAR(MAX) in different situations?
George2
Only use varchar(max) when a regular varchar is not big enough (8Kb should be enough for everybody ;)
edosoft
+2  A: 

Hi

There have been some major changes in SQL Server 2008, so the following article might be worth considering when deciding which data type to use: http://msdn.microsoft.com/en-us/library/ms143432.aspx

Maximum sizes listed there:

  1. Bytes per varchar(max), varbinary(max), xml, text, or image column: 2^31-1
  2. Characters per nvarchar(max) column: 2^30-1
Draz