The string in question would be the description field of a (cooking) recipe, and the max length should be something that 99% of users should never run into. nvarchar(4000) seems like it's probably too limiting.

Is a column in a SQL table even the appropriate place for this? It doesn't feel right to store such a (potentially) large value in a field like this, but maybe it's fine?

Not sure it matters, but I'm on .NET 3.5 and will most likely use LINQ2SQL.

Edit: Using the VS Express Database Explorer to create the tables, it's telling me that 4000 is the max size for nvarchar (doesn't seem to have varchar listed as an option). Is this just a limitation of SQLCE and an indication that I'll have to look into something else?

If it's true that this is a limitation of SQLCE, does anyone have another recommendation? For a pet project, it'd have to be something free and preferably easy to set up (ideally for both me and the end user, but it matters more that it's easy for the end user). The database will be local, and performance isn't too much of a concern.

A: 

Most SQL databases are smart enough to do this automatically for large VARCHARs and for TEXT columns. Rather than allocating space for the full declared size when the row is created, the database stores each row's data so that it takes up only slightly more space than the actual contents (rather than the maximum size).

Don Werve
Is nvarchar the most appropriate datatype for this? Visual Studio (Express) is telling me that 4000 is the max size for nvarchar with SQLCE, and that seems like it might be too small for my needs. Should I look into another SQL provider for this?
Davy8
Not sure what you mean by the database being smart enough to do this automatically. If you try to send 5000 characters to an nvarchar(4000) it would seem to me that you will either see the data truncated or you will get a runtime error...
Gary.Ray
I mean smart enough to store the strings in a compact way, usually by breaking them into blocks or with size-prefix notation, with the first N bytes indicating the size of the payload (so 'mystring' is internally represented by '8mystring').
Don Werve
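To make the size-prefix idea from the comment above concrete, here is a minimal Python sketch. It is only an illustration: the function names and the 4-byte big-endian prefix are assumptions made for this example, and real database engines use their own internal formats.

```python
def encode_length_prefixed(text: str) -> bytes:
    """Store the payload size in the first 4 bytes, followed by the payload."""
    payload = text.encode("utf-8")
    return len(payload).to_bytes(4, "big") + payload


def decode_length_prefixed(blob: bytes) -> str:
    """Read the 4-byte size prefix, then return exactly that many payload bytes."""
    size = int.from_bytes(blob[:4], "big")
    return blob[4:4 + size].decode("utf-8")


# 'mystring' is 8 bytes long, so the stored form is the prefix 8 plus the text.
stored = encode_length_prefixed("mystring")
restored = decode_length_prefixed(stored)
```

The stored value is only 4 bytes longer than the text itself, regardless of the maximum size declared for the column.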
A: 

I've never used SQL CE, but see if it supports the VARCHAR(MAX) data size. Basically, it stores large amounts of text (up to 2GB) outside the scope of the 8K row size limit, but also lets you use ' = ' and other WHERE clause operators (the TEXT data type only supported LIKE).

HardCode
+1  A: 

Have you done any studies on existing recipes? A varchar(4000) would give you around 400-500 words, and I am pretty sure not many of the recipes in the many cookbooks I have have a description longer than that.

VarBinary would get you 8000 bytes, but if you are going to be doing any searching in the description field, using varbinary could require casts or other operations that will incur a performance hit.

Finally, while I don't particularly like this, you could normalize descriptions into a different table which would allow you to set a one-to-many relationship and enable a recipe to have more than one description part which you would reassemble in the interface.
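A rough Python sketch of the "description parts" idea follows, assuming each part is capped at 4000 characters to match nvarchar(4000). The names `split_description` and `join_parts` are hypothetical, and the sample text is made up.

```python
MAX_PART_LENGTH = 4000  # assumed per-row limit, matching nvarchar(4000)


def split_description(description: str, max_len: int = MAX_PART_LENGTH) -> list[str]:
    """Cut a long description into consecutive parts of at most max_len characters."""
    return [description[i:i + max_len] for i in range(0, len(description), max_len)]


def join_parts(parts: list[str]) -> str:
    """Reassemble the parts, in order, into the original description."""
    return "".join(parts)


# Made-up sample text that is longer than a single part.
long_description = "Stir gently. " * 700
parts = split_description(long_description)
rebuilt = join_parts(parts)
```

Each part would become one row in the child table, with a sequence number so the interface can put them back in order.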

Gary.Ray
Good call on checking whether 4k is actually insufficient. I'll probably do that check in a little bit, but it's good info to know regardless.
Davy8
After some copy/paste from recipe sites into OpenOffice for a char count, it seems that 4k should be plenty for 95-99%, which is good enough for me.
Davy8
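The same kind of check can be scripted instead of done by hand. This is a small Python sketch, assuming the sample descriptions have already been collected into a list; the function name is hypothetical and the sample strings are placeholders, not real recipe data.

```python
LIMIT = 4000  # characters available in an nvarchar(4000) column


def fraction_within_limit(descriptions, limit=LIMIT):
    """Return the share of descriptions whose length fits within the limit."""
    if not descriptions:
        return 0.0
    fitting = sum(1 for text in descriptions if len(text) <= limit)
    return fitting / len(descriptions)


# Placeholder samples of various lengths (not real recipes).
samples = ["a" * 1200, "b" * 3500, "c" * 4000, "d" * 5200]
share = fraction_within_limit(samples)
```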
+1  A: 

Not necessarily recommended, but provided because they sprang to mind:

If the text is very rarely altered once stored, you might consider creating a new table that stores "lines" of text, something like:

recipe_id integer,
line_number integer,
line_text nvarchar(80)
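As a sketch of how such a table could be written and read back, here is a Python example using the built-in sqlite3 module with an in-memory database. SQLite is only a stand-in here (it is not SQL CE), its TEXT type replaces nvarchar(80), and the table name `recipe_lines` and the helper functions are hypothetical.

```python
import sqlite3

LINE_WIDTH = 80  # mirrors the nvarchar(80) column in the layout above

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE recipe_lines (
        recipe_id   INTEGER NOT NULL,
        line_number INTEGER NOT NULL,
        line_text   TEXT    NOT NULL,
        PRIMARY KEY (recipe_id, line_number)
    )
    """
)


def save_description(recipe_id, description):
    """Split the text into fixed-width chunks and store one row per chunk."""
    chunks = [description[i:i + LINE_WIDTH]
              for i in range(0, len(description), LINE_WIDTH)]
    conn.executemany(
        "INSERT INTO recipe_lines (recipe_id, line_number, line_text) "
        "VALUES (?, ?, ?)",
        [(recipe_id, number, chunk) for number, chunk in enumerate(chunks, start=1)],
    )


def load_description(recipe_id):
    """Read the rows back in line order and join them into one string."""
    rows = conn.execute(
        "SELECT line_text FROM recipe_lines "
        "WHERE recipe_id = ? ORDER BY line_number",
        (recipe_id,),
    )
    return "".join(row[0] for row in rows)


# Made-up sample text for illustration.
text = "Whisk the eggs, then fold in the flour. " * 10
save_description(1, text)
loaded = load_description(1)
```

Slicing at a fixed width keeps reassembly exact; splitting on word boundaries would read better per row but needs extra care to restore the original whitespace.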

Alternatively, if you don't need to search the text of the recipe, how about a simple compression algorithm? Huffman encoding is fairly effective on text and not horribly CPU-intensive.
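For a quick feel of what compression buys, here is a Python sketch. Note the swap: instead of a hand-written Huffman coder it uses the standard library's zlib, whose DEFLATE format combines LZ77 with Huffman coding. The function names and the sample text are made up for this example.

```python
import zlib


def compress_description(text: str) -> bytes:
    """Compress the UTF-8 bytes of the text at zlib's highest compression level."""
    return zlib.compress(text.encode("utf-8"), 9)


def decompress_description(blob: bytes) -> str:
    """Reverse compress_description and return the original text."""
    return zlib.decompress(blob).decode("utf-8")


# Made-up, highly repetitive sample text; real recipes will compress less well.
description = "Preheat the oven to 180 C and grease a loaf tin. " * 100
packed = compress_description(description)
unpacked = decompress_description(packed)
```

The compressed bytes would need a binary column, and as noted above the text can no longer be searched directly in SQL.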

Mike Woodhouse