A lot of database schemas seem to follow this standard:

(2^n)-1 for large fields:

varchar(511)
varchar(255)
varchar(127)

...then (2^n) for smaller ones

varchar(64)
varchar(32)
varchar(16)
varchar(8)

I understand why numbers of the form (2^n)-1 are used; what I don't understand is why it is not considered necessary to continue the trend down to the small fields.

E.g.

varchar(63)
varchar(31)
varchar(15)
varchar(7)

Is there a reason for this or is it just that the returns have diminished too far?
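For reference, the sizes quoted above are just 2^n and (2^n)-1 for a handful of values of n, which a quick loop makes obvious:

```python
# Print the power-of-two sizes and their "minus one" counterparts
# that appear in the varchar declarations above.
for n in (3, 4, 5, 6, 7, 8, 9):
    print(2**n, 2**n - 1)  # e.g. 8 7, ..., 256 255, 512 511
```

So varchar(511), varchar(255), and varchar(127) are (2^9)-1, (2^8)-1, and (2^7)-1, while varchar(64) down to varchar(8) are plain powers of two.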

+3  A: 

I can't say I've seen the "-1" trend used differently for smaller versus larger fields. However, I think it's all nothing more than developer/DBA OCD: always wanting "round" numbers, when they should really be making field-length decisions based on business rules rather than "prettiness".

Robin Day
A: 

Actually, that's the first time I've heard of such a "trend". My varchar fields are always as long as I need them to be.

I would say that there's no particular reason behind it. Just the preference of the developer.

Maximilian Mayerl
A: 

I cannot imagine for a moment why they are sizing the fields that way on either count, 2^N or 2^N-1. From a SQL Server perspective, it sounds more like a misconception about how SQL Server stores that data at the page level than a specific reason.

I would pay more attention to the types, potential off-page storage (SLoB/LoB), and rows per page, plus meeting the business needs with sizing, rather than to a formula based on 2^N etc.

Andrew
+4  A: 

I think it's just programmers picking a number out of the air when it doesn't really matter.

If you had to put a limit on a "notes" field, for example, a non-programmer would probably pick 100 or 200 characters as a round number, whereas a programmer might think of 127 or 255 as a round number.

John Burton
+2  A: 

I remember the old times, when using a 2^n length was better for the alignment of blocks on disk or in memory. Aligned blocks were faster. Today, "block" sizes are bigger, and memory and disk are fast enough to ignore the alignment, except for very large blocks, that is. (Whatever "very large" means today....)

Nowadays it is just traditional to do so.

And another reason may be the famous saying: there are only 10 types of people: those who understand binary and those who don't.

And 2^n - 1 values are candidates for Mersenne primes, so it's geeky, too...

WegDamit
For the 2^n - 1 problem: many programming languages use a null byte to terminate a string value (Pascal was one that didn't, storing the length in the first byte of the string instead), so you need one byte more to store the string. So this could be a reason for using one less, too.
WegDamit
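The arithmetic behind that comment is easy to check: a C-style string of 255 characters plus its terminating NUL byte fills a 256-byte (2^8) buffer exactly, which is one story for why 255 rather than 256 shows up as a limit. A minimal sketch:

```python
# A 255-character payload plus the C-style NUL terminator
# occupies exactly 256 bytes, a power of two.
payload = "a" * 255
c_string = payload.encode("ascii") + b"\x00"  # data + terminator

print(len(payload))   # 255 characters of data
print(len(c_string))  # 256 bytes of storage
```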
+1  A: 

Not only have the returns diminished, but if you set string types with arbitrary limits you're much more likely to cause a problem than solve one.

If it's really a business constraint, then use TEXT type (or similar) and a CHECK constraint on the length, and pick the number according to the business constraint you're trying to enforce.
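As a sketch of that suggestion, here is the TEXT-plus-CHECK pattern using SQLite via Python's sqlite3 module (the table name and the 100-character limit are made-up examples standing in for an actual business rule):

```python
import sqlite3

# Hypothetical table: an unbounded TEXT column whose length is bounded
# by a CHECK constraint expressing the business rule, not the storage.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        name TEXT CHECK (length(name) <= 100)
    )
""")

conn.execute("INSERT INTO customer VALUES (?)", ("a" * 100,))  # accepted

try:
    conn.execute("INSERT INTO customer VALUES (?)", ("a" * 101,))
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # the CHECK constraint fails
```

Changing the limit later is then a constraint change rather than a column-type change, which in many systems is a cheaper operation.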

The DBMS will probably implement the storage for the TEXT type identically to varchar(x) anyway. If it doesn't, and you really care, you should investigate the internal storage strategies of your particular system, and consider whether there's any benefit to varchar or not.

Jeff Davis