ansaurus

Question

What is the best alternative to calling strlen() in my for loop condition in C?

Answer 1

+8 A:

The second one is usually preferred.

The other popular form is

for (char* p = something; *p; p++)
{
   // ... work with *p
}

Yet another one is

char* p = something;
char c;
while ((c = *p++))
{
    // ... do something with c
}

(The extra parentheses around the assignment are needed to keep some suspicious compilers from warning that I might have meant a comparison inside the while condition.)

Indeed, strlen takes time proportional to the length of the string, because it must go through the whole string looking for the terminating 0. So, strlen is essentially implemented as

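size_t my_strlen(const char *p)   /* illustrative name, not the library's */
{
    size_t s = 0;                 /* strlen returns size_t, not int */
    while (*p++) s++;             /* count characters before the terminating 0 */
    return s;
}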

(Well, in practice libraries use a more heavily optimized version, often written in assembler.)

So you ought to avoid using strlen if possible.

Vlad 2010-03-03 17:38:35
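
To make the cost concrete, here is a minimal sketch, not code from the question. The names counted_strlen, strlen_calls, count_spaces_naive and count_spaces_walk are made up for illustration. The wrapper only counts how often the length is queried when the call sits in the loop condition; a real compiler may hoist a plain strlen call if it can prove the string is not modified, so treat this as the worst case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counter and wrapper, used only to make the calls visible. */
static size_t strlen_calls = 0;

static size_t counted_strlen(const char *s)
{
    ++strlen_calls;
    return strlen(s);
}

/* Counts spaces with the length re-queried in the loop condition. */
size_t count_spaces_naive(const char *s)
{
    size_t spaces = 0;
    for (size_t i = 0; i < counted_strlen(s); ++i)
        if (s[i] == ' ')
            ++spaces;
    return spaces;
}

/* Same result in a single pass: stop at the terminating 0. */
size_t count_spaces_walk(const char *s)
{
    size_t spaces = 0;
    for (const char *p = s; *p; ++p)
        if (*p == ' ')
            ++spaces;
    return spaces;
}
```

For a string of length n, the condition in count_spaces_naive is evaluated n + 1 times, and each evaluation walks the whole string again.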

you need *p as the condition, not p;

nos 2010-03-03 17:39:55

@nos: I just added it.

T.J. Crowder 2010-03-03 17:41:31

indeed. just added one more version.

Vlad 2010-03-03 17:42:22

So, null char is interpreted as false then? Is null not interpreted as false also (ie, leaving the condition as the pointer, p, rather than *p) ?

Mithrax 2010-03-03 17:46:38

In C89 there is no true/false and no bool type at all (C99 added _Bool and <stdbool.h>). In a condition, 0 means false and anything != 0 is true.

Vlad 2010-03-03 17:48:13

Yes. `'\0'` is the same as zero, which is the same as false.

Billy ONeal 2010-03-03 17:48:55

checking for `p` instead of `*p` was wrong, because the pointer to the last character is usually not a NULL pointer.

Vlad 2010-03-03 17:51:02

@Vlad: the pointer to the last character is *always* not a null pointer in standard C. 6.3.2.3/3 says, "... a null pointer, is guaranteed to compare unequal to a pointer to any object or function." Weird kernel modes that let you map address 0, when address 0 is a null pointer, are non-standard.

Steve Jessop 2010-03-03 18:29:04

@Steve: just for clarification: a pointer whose bit representation/address is 0 may very well be valid and compare unequal to null pointers and null pointer constants, i.e. even though it consists only of 0 bits, it will still compare unequal to `0`! This is the reason why you can't use `calloc()` or `memset()` to get null pointers if you're restricted to the semantics of standard C and can't assume any other implementation details.

Christoph 2010-03-03 18:41:37

Yes, the requirement is that null pointers don't compare equal to pointers to objects (and hence a pointer to any part of a string cannot be a null pointer), not that null pointers have any particular storage representation. All-0 is of course very common, and I used that example simply because it really happens, for example in the Linux kernel.

Steve Jessop 2010-03-03 18:57:52
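
The point about calloc() and memset() can be sketched as follows. This is only an illustration, and alloc_null_pointers is a made-up helper name. It assigns NULL to each element instead of relying on zeroed bytes, which works regardless of how the implementation represents null pointers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n char pointers, each set to a null pointer. */
char **alloc_null_pointers(size_t n)
{
    char **v = malloc(n * sizeof *v);
    if (v == NULL)
        return NULL;
    /* Assigning NULL is portable; all-bits-zero from calloc/memset is a
       null pointer only on implementations that represent it that way. */
    for (size_t i = 0; i < n; ++i)
        v[i] = NULL;
    return v;
}
```

The caller releases the array with free() as usual.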

be warned, some might frown on you for using for.

IanNorton 2010-03-03 20:25:20

@IanNorton: who would frown on that? If the loop controls are all in the statement, for is better than while with the loop controls scattered around the code.

Jonathan Leffler 2010-03-03 20:55:03

@Steve: your `something` can be just a NULL pointer, therefore `p` _may_ be NULL in that context.

Vlad 2010-03-03 21:00:47
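
If the starting pointer itself may be null, both tests can go in the condition. A small sketch, with safe_length as a hypothetical name: `p` guards against a null starting pointer, and `*p` stops at the terminating '\0'.

```c
#include <stddef.h>

/* Hypothetical helper: length of s, treating a null pointer as empty. */
size_t safe_length(const char *s)
{
    size_t n = 0;
    /* p can only be null on the first test; after that only *p matters. */
    for (const char *p = s; p && *p; ++p)
        ++n;
    return n;
}
```

Testing `p` on every iteration is redundant after the first pass, so a single `if (s == NULL)` check before the loop would do the same job.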

Answer 2

+3 A:

These are preferred:

for (int i = 0; str[i]; ++i)
for (char* p = str; *p; ++p)

Tronic 2010-03-03 17:42:24

Answer 3

+1 A:

Usually the second. If nothing else, the first one has to traverse the string twice: once to find the length, and again to operate on each element. On the other hand, if you have code already written with strlen, it may be easier to just hoist the strlen call out of the loop and still get most of the benefit.

Jerry Coffin 2010-03-03 17:43:53
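
Hoisting the call might look like the sketch below. The function upcase_hoisted is an invented example, not code from the question. The length is computed once and the loop condition only compares two integers.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: upper-case a string in place. */
void upcase_hoisted(char *s)
{
    const size_t len = strlen(s);   /* one traversal to find the length */
    for (size_t i = 0; i < len; ++i)
        s[i] = (char)toupper((unsigned char)s[i]);
}
```

This still reads the string twice in total, but no longer once per iteration.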

Answer 4

+3 A:

If some part of your loop can overwrite the NUL char at the end of your string, the version that calls strlen (provided the length is taken once, before the loop starts) will still finish before the end of your buffer. The second version could overrun the buffer and party all over somebody else's memory. The strlen version is also easier to understand at a glance.

Gabe 2010-03-03 17:54:08
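
A contrived sketch of that situation, with smear_bounded as a made-up name: the loop body writes one position ahead, so its last iteration replaces the NUL. Because the length was captured before any writes, the loop still runs exactly that many times. It assumes the caller provides at least strlen(s) + 2 bytes of storage.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: the body writes one position ahead, so its last
   iteration overwrites the terminating NUL. The caller must guarantee at
   least strlen(s) + 2 bytes of storage. */
size_t smear_bounded(char *s)
{
    const size_t len = strlen(s);   /* length captured before any writes */
    size_t iterations = 0;
    for (size_t i = 0; i < len; ++i) {
        s[i + 1] = '#';             /* at i == len - 1 this replaces the NUL */
        ++iterations;
    }
    s[len + 1] = '\0';              /* restore a terminator */
    return iterations;
}
```

A `for (p = s; *p; ++p)` loop with the same body would keep finding the '#' it had just written and would never see a 0, so it would run past the end of the buffer.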

+1 for correct spelling of NUL.

Heath Hunnicutt 2010-03-03 17:55:47

For those who don't know, NULL is the null pointer, while NUL is the ASCII character represented by `'\0'`.

Gabe 2010-03-03 17:59:37

Answer 5

A:

To me, the first approach looks like this:

while(..) { Test end of string }  /* strlen */
while(..) { Your Code }           /* processing */

while the second approach looks like this:

while(..) { Your Code + Test end of string } /* both */

IMHO, both approaches perform roughly the same number of operations, and I consider them equivalent. Also, strlen is quite optimized, as already mentioned, and well tested. Moreover, the second approach looks like premature optimization :) You would do better to test/profile your code first and optimize only if necessary (after all, it is only a linear algorithm).

You may, however, consider the second approach if the processing may stop long before the end of the string (e.g., finding the first occurrence of a word).

coredump 2010-03-03 18:33:49
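
The early-exit case can be sketched like this, using a single character instead of a word to keep it short. The name first_occurrence is hypothetical. The loop returns as soon as it finds a match, so the rest of the string is never read, whereas a strlen call up front would always scan to the end.

```c
#include <stddef.h>

/* Hypothetical helper: pointer to the first ch in s, or NULL if absent. */
const char *first_occurrence(const char *s, char ch)
{
    for (const char *p = s; *p; ++p)
        if (*p == ch)
            return p;               /* stop here; the rest is never read */
    return NULL;
}
```

For a non-zero ch this does the same job as the standard strchr, which is usually the better choice in real code.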

Actually, the first one is O(2n), the second is O(n). They are only roughly equivalent *when it doesn't matter*.

Paul Nathan 2010-03-03 18:36:53

Considering big O notation, they are both O(n). My point is that if you unroll the loops, you have (n*x) + (n*y) == n * (x+y) operations, which is why I don't think the second approach is twice as fast as the first. Frequently, we want to optimize and make a complex single pass of processing instead of multiple simple iterations; it may be highly justified, depending on the actual problem, or it may not; I also think that optimizing such linear algorithms is not an efficient use of the available coding time.

coredump 2010-03-03 19:57:40
