I've read that it is bad practice to call strlen() in my for loop condition, because this is an O(N) operation.

However, when looking at alternatives I see two possible solutions:

int len = strlen(somestring);  
for(int i = 0; i < len; i++)  
{

}

or...

for(int i = 0; somestring[i] != '\0'; i++)  
{

}

Now, the second option seems like it might have two advantages: 1) it doesn't declare an unnecessary variable, and 2) if the string's length changes inside the loop, the loop should still reach the end, as long as the new length isn't less than i.

However, I'm not sure. Which one of these is standard practice among C programmers?

+8  A: 

The second one is usually preferred.

The other popular form is

for (char* p = something; *p; p++)
{
   // ... work with *p
}

Yet another one is

char* p = something;
char c;
while ((c = *p++))
{
    // ... do something with c
}

(The extra parentheses around the assignment keep suspicious compilers from warning that I might have meant a comparison inside the while condition.)
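As a rough sketch of how these forms line up, here are three counting loops that should all visit the same characters. The function names are made up for illustration, and the body just counts instead of doing real work.

```c
#include <assert.h>
#include <stddef.h>

/* Index form: test the current character against the terminator. */
size_t count_by_index(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        n++;
    return n;
}

/* Pointer form: advance p until it points at the terminator. */
size_t count_by_pointer(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (const char *p = s; *p; p++)
        n++;
    return n;
}

/* Assign-and-test form: c receives each character, and the loop ends
   when the character just read is the terminator. */
size_t count_by_assignment(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    const char *p = s;
    char c;
    while ((c = *p++)) {
        (void)c;   /* a real loop body would use c here */
        n++;
    }
    return n;
}
```

Each of them makes a single pass over the string, with no separate pass to find the length first.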

Indeed, strlen is quite slow, because it must go through the whole string looking for the terminating 0. So, strlen is essentially implemented as

size_t s = 0;
while (*p++) s++;
return s;

(well, in fact a slightly more optimized assembler version is used).

So you ought to avoid using strlen if possible.

Vlad
you need *p as the condition, not p;
nos
@nos: I just added it.
T.J. Crowder
indeed. just added one more version.
Vlad
So, null char is interpreted as false then? Is null not interpreted as false also (ie, leaving the condition as the pointer, p, rather than *p) ?
Mithrax
In C, a condition is just an integer test: 0 means false, anything != 0 is true. There was no bool type at all until C99 added _Bool and <stdbool.h>.
Vlad
Yes. `'\0'` is the same as zero, which is the same as false.
Billy ONeal
checking for `p` instead of `*p` was wrong, because the pointer to the last character is usually not a NULL pointer.
Vlad
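To make the `p` versus `*p` point concrete, here is a small sketch (the helper names are hypothetical). It checks that the terminator compares equal to zero, and that the pointer which ends up pointing at the terminator is itself not a null pointer, so testing `p` alone would never stop the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the terminator character compares equal to integer zero. */
int nul_is_zero(void)
{
    return '\0' == 0;
}

/* Walks to the terminator of s and reports whether the pointer that
   ends up pointing AT the terminator is non-null while the character
   it points to is zero. */
int end_pointer_is_non_null(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    while (*p)      /* tests the character, not the pointer */
        p++;
    return p != NULL && *p == '\0';
}
```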
@Vlad: the pointer to the last character is *always* not a null pointer in standard C. 6.3.2.3/3 says, "... a null pointer, is guaranteed to compare unequal to a pointer to any object or function." Weird kernel modes that let you map address 0, when address 0 is a null pointer, are non-standard.
Steve Jessop
@Steve: just for clarification: a pointer with bit-representation/address 0 may very well be valid and compare unequal to null pointers and null pointer constants, i.e. even though it consists only of 0 bits, it will still compare unequal to `0`! This is the reason why you can't use `calloc()` or `memset()` to get null pointers if you're restricted to the semantics of standard C and can't assume any other implementation details.
Christoph
Yes, the requirement is that null pointers don't compare equal to pointers to objects (and hence a pointer to any part of a string cannot be a null pointer), not that null pointers have any particular storage representation. All-0 is of course very common, and I used that example simply because it really happens, for example in the Linux kernel.
Steve Jessop
be warned, some might frown on you for using for.
IanNorton
@IanNorton: who would frown on that? If the loop controls are all in the statement, for is better than while with the loop controls scattered around the code.
Jonathan Leffler
@Steve: your `something` can be just a NULL pointer, therefore `p` _may_ be NULL in that context.
Vlad
+3  A: 

These are preferred:

for (int i = 0; str[i]; ++i)
for (char* p = str; *p; ++p)
Tronic
+1  A: 

Usually the second. If nothing else, the first one has to traverse the string twice: once to find the length, and again to operate on each element. On the other hand, if you have code already written with strlen, it may be easier to just hoist the strlen call out of the loop and still get most of the benefit.
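To get a feel for what hoisting buys, here is an instrumented sketch. `counting_strlen` is a hypothetical stand-in for strlen that counts every character it examines; it is not how a real library implements it, and the loop bodies are left empty on purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many characters have been examined, for illustration only. */
static size_t reads;

/* Hypothetical stand-in for strlen that records every character it reads,
   including the terminator. */
static size_t counting_strlen(const char *s)
{
    size_t n = 0;
    for (;;) {
        reads++;
        if (s[n] == '\0')
            return n;
        n++;
    }
}

/* Calls the length function on every evaluation of the condition. */
size_t reads_with_call_in_condition(const char *s)
{
    assert(s != NULL);
    reads = 0;
    for (size_t i = 0; i < counting_strlen(s); i++) {
        /* loop body would go here */
    }
    return reads;
}

/* Hoists the call: one pass for the length, then the loop itself. */
size_t reads_with_hoisted_call(const char *s)
{
    assert(s != NULL);
    reads = 0;
    size_t len = counting_strlen(s);
    for (size_t i = 0; i < len; i++) {
        /* loop body would go here */
    }
    return reads;
}
```

For the 4-character string "abcd", the first function examines 25 characters (5 calls of 5 reads each) and the second examines 5, so the gap grows quadratically with the length.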

Jerry Coffin
+3  A: 

If some part of your loop can overwrite the NUL char at the end of your string, the version that calls strlen will still finish before the end of your buffer. The second version could overrun the buffer and party all over somebody else's memory. The strlen version is also easier to understand at a glance.
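Here is a minimal sketch of that failure mode, with made-up function names. The loop body writes one position ahead, so it destroys the terminator. The `cap` parameter is an assumption added only so the sentinel version stays in bounds for the demonstration; without that guard it would run off the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length computed up front: the iteration count is fixed before any write. */
size_t iterations_with_fixed_length(char *buf, size_t cap)
{
    assert(buf != NULL);
    size_t len = strlen(buf);
    assert(len < cap);              /* the write to buf[len] must fit */
    size_t i;
    for (i = 0; i < len; i++)
        buf[i + 1] = 'x';           /* overwrites the NUL when i == len - 1 */
    return i;
}

/* Sentinel test: the body keeps destroying the terminator it is looking
   for, so only the explicit cap check stops the loop. */
size_t iterations_with_sentinel(char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    size_t i;
    for (i = 0; i + 1 < cap && buf[i] != '\0'; i++)
        buf[i + 1] = 'x';
    return i;
}
```

With an 8-byte buffer holding "abc", the fixed-length version stops after 3 iterations, while the sentinel version keeps going until the cap stops it at 7.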

Gabe
+1 for correct spelling of NUL.
Heath Hunnicutt
For those who don't know, NULL is the null pointer, while NUL is the ASCII character represented by `'\0'`.
Gabe
A: 

The first approach looks like this to me:

while(..) { Test end of string }  /* strlen */
while(..) { Your Code }           /* processing */

While the second approach looks like:

while(..) { Your Code + Test end of string } /* both */

IMHO, both approaches perform roughly the same number of operations, and I consider them equivalent. Also, strlen is quite optimized, as already mentioned, and well tested. Moreover, the second approach looks like premature optimization :) You'd better test/profile your code if necessary and then optimize (after all, it is only a linear algorithm).

You may however consider the second approach if the processing may possibly stop long before the end of the string (e.g., find the first occurrence of a word).
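A small sketch of that early-exit case, using a single character instead of a word to keep it short (`find_first` is a made-up name). The scan returns at the first match, so nothing after it is read, whereas a strlen call up front would always walk the whole string first.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first occurrence of ch in s, or -1 if absent.
   The scan stops at the first match, so later characters are never read. */
long find_first(const char *s, char ch)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == ch)
            return (long)i;
    }
    return -1;
}
```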

coredump
Actually, the first one is O(2n), the second is O(n). They are only roughly equivalent *when it doesn't matter*.
Paul Nathan
Considering big O notation they are both O(n). My point is that if you unroll the loops, you have (n*x) + (n*y) == n * (x+y) operations, which is why I don't think the second approach is twice as fast as the first. Frequently, we want to optimize and make a complex single pass of processing instead of multiple simple iterations; it may be highly justified, depending on the actual problem, or it may not; I also think that optimizing such linear algorithms is not an efficient use of the available coding time.
coredump