views:

123

answers:

6

I tend to make the length of character strings some power of two (16, 32, 64). Is there any optimization benefit in doing this for objects of type string, such as a string variable, a collection of strings, or a database column of type string? This is in a .NET/SQL Server environment.

+1  A: 

No. What would you do with the part of the string you aren't using, given that it's just padding? The cost of that waste will be significant compared with any saving you might get from trying to align the strings. It's very doubtful that such lengths would have any benefit anyway.

AnthonyWJones
A: 

This is an area where optimization may not be that beneficial. I would define the lengths as needed and then come back later and optimize the lengths if need be. I think you will find the default handling of string lengths to be sufficient.

Andrew Hare
A: 

No. The power-of-two sizing optimization comes from the very dawn of the era of databases and had to do with how data was aligned on disk and in memory. Today, it's a vestigial behavior that gains no advantage.

Jekke
+3  A: 

Since .NET strings aren't null-terminated, you'd have to be very clever to actually consume the perfect number of characters in every single string.

String message = "hello world!!!!!"; // Exactly 16 chars

Besides, the power-of-two sizing of strings is only important when your implementation uses "malloc" to perform memory allocations. It's a memory-allocation strategy which says "my individual bits and pieces of memory will fit better into the heap, with less wasted space, if they all have power-of-two sizes".

But .NET doesn't use malloc to allocate memory. Instead, all heap memory is allocated by incrementing the heap pointer. When the GC frees memory later, it will perform heap compaction, so that all new memory comes from the end, and it never needs to find a small chunk of memory within a fragmented heap.

benjismith
+2  A: 

For the column in a database: be aware of SQL Server's 8 KB data pages. The smaller your rows, the more rows you can fit on each data page. The more rows you can fit on each data page, the faster those rows can be read (fewer pages means less IO). This applies to both tables and indexes.

Here's some more information from Wikipedia.
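To put rough numbers on the page argument, here is a minimal sketch. It assumes the usual SQL Server layout of an 8,192-byte page with a 96-byte header and a 2-byte slot-array entry per row, and it ignores per-row overhead such as the row header and null bitmap, so treat the results as an upper bound. PageMath and RowsPerPage are made-up names for illustration.

```csharp
using System;

static class PageMath
{
    // Assumed layout: 8,192-byte page, 96 bytes of which are the page header.
    const int PageBytes = 8192;
    const int HeaderBytes = 96;
    // Assumed: each row costs a 2-byte entry in the page's slot array.
    const int SlotBytes = 2;

    // Rough upper bound on rows per page for a given row size in bytes.
    // Ignores per-row overhead (row header, null bitmap, and so on).
    public static int RowsPerPage(int rowBytes) =>
        (PageBytes - HeaderBytes) / (rowBytes + SlotBytes);

    static void Main()
    {
        Console.WriteLine(RowsPerPage(100)); // 79
        Console.WriteLine(RowsPerPage(128)); // 62
    }
}
```

Under these assumptions, padding a 100-byte row up to 128 bytes costs about 17 rows per page, which means more pages to read for the same data.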

David B
+1  A: 

Strings in C#/.NET are immutable, so there's no point in (or any way of) preallocating space to hold more characters when constructing a string. Appending to a string returns a new string: the runtime allocates fresh space for the entire result rather than growing the original in place. As for SQL columns, make them the exact length of the string if you know it in advance (char(N)), or use varying-length character data (varchar(N)) with N chosen as a suitable maximum. I don't see any point in keeping these a power of two. SSMS defaults to 50 when you create a varchar column, so apparently neither does Microsoft.
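The immutability point can be checked with a small sketch. The class and method names here are invented for the example; it only shows that appending produces a different string object and leaves the original untouched.

```csharp
using System;

static class ImmutabilityDemo
{
    // Returns true if appending left the old string intact and
    // produced a separate object for the new value.
    public static bool AppendCreatesNewString()
    {
        string original = "hello";
        string alias = original;   // second reference to the same string object
        original += " world";      // allocates a new string; alias is not modified
        return alias == "hello" && !ReferenceEquals(original, alias);
    }

    static void Main() => Console.WriteLine(AppendCreatesNewString());
}
```

Because every append allocates a string of exactly the length required, there is no spare capacity for a power-of-two length to line up with.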

The one place that preallocating may make a difference is in something like StringBuilder or in preallocating the size of a collection. Again, it should be sized with the goal of it not having to be resized, but close to its actual usage if known. If not known, then either skip the initial sizing or make it large enough to hold most of the cases.
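A minimal sketch of that kind of preallocation, assuming you can estimate the final size up front. JoinWords and MakeList are hypothetical helpers; the StringBuilder(int) and List<T>(int) constructors are the standard capacity overloads.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class PreallocationDemo
{
    // Joins words with single spaces, sizing the builder once up front
    // so it should not need to grow while appending.
    public static string JoinWords(IReadOnlyList<string> words)
    {
        int total = 0;
        foreach (string w in words) total += w.Length + 1; // word plus separator

        var sb = new StringBuilder(total); // capacity based on actual usage
        foreach (string w in words) sb.Append(w).Append(' ');
        return sb.ToString().TrimEnd();
    }

    // Creates a list whose backing array already has room for the expected items.
    public static List<string> MakeList(int expectedCount) =>
        new List<string>(expectedCount);

    static void Main()
    {
        Console.WriteLine(JoinWords(new[] { "power", "of", "two" }));
        Console.WriteLine(MakeList(100).Capacity);
    }
}
```

The capacity is driven by the expected content, not by rounding up to a power of two.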

tvanfosson