ansaurus

Question

Why is this C-style code 10X slower than this obj-C style code?

Answer 1

+3 A:

This has nothing to do with performance, but your C code has an error in it: buf += sizeof(char) should simply be buf++. Pointer arithmetic always moves in units of the size of the pointed-to type. It worked in this case only because sizeof(char) is 1.
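To illustrate the point about pointer arithmetic, here is a minimal sketch (not the asker's code; the function names are hypothetical). It measures how many bytes a pointer moves when advanced by one element versus by sizeof(type), using int so the difference is visible.

```c
#include <stddef.h>

/* Bytes skipped when an int pointer advances by one element (what p++ does). */
size_t int_step_bytes(void) {
    int values[sizeof(int) + 1] = {0};
    int *p = values;
    int *next = p + 1;
    return (size_t)((char *)next - (char *)p);
}

/* Bytes skipped by the mistaken form p += sizeof(int):
   the pointer moves sizeof(int) ELEMENTS, not sizeof(int) bytes. */
size_t int_step_bytes_with_sizeof(void) {
    int values[sizeof(int) + 1] = {0};
    int *p = values;
    int *next = p + sizeof(int);
    return (size_t)((char *)next - (char *)p);
}

/* With char the two forms coincide, because sizeof(char) is 1 by definition. */
size_t char_step_bytes_with_sizeof(void) {
    char buf[2] = {0};
    char *p = buf;
    char *next = p + sizeof(char);
    return (size_t)(next - p);
}
```

On a platform with 4-byte int, the first function would return 4 and the second 16; the char version returns 1 everywhere, which is why the original code happened to work.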

Jeff Mc 2009-09-09 03:25:08

Which has nothing to do with the question at hand. To the compiler, sizeof(1), sizeof(char), and 1 are absolutely identical. Thus, though a bug, it cannot be the source of the performance discrepancy.

bbum 2009-09-09 03:28:18

Oh, come on. That was a perfectly factual comment, and a useful style note; needless use of sizeof is always a great clue that the code you are reading has pointer bugs in it. And your correction is factually incorrect to boot: sizeof(1) is sizeof(int), which is 4 on most platforms.
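A quick way to check the sizeof claim, as a small sketch with hypothetical function names: the literal 1 has type int, so sizeof(1) reports the size of int, while sizeof(char) is always 1.

```c
#include <stddef.h>

/* The literal 1 has type int, so this is sizeof(int). */
size_t size_of_int_literal(void) {
    return sizeof(1);
}

/* sizeof(char) is 1 by definition in C. */
size_t size_of_char_type(void) {
    return sizeof(char);
}
```

The exact value of sizeof(int) is implementation-defined; 4 is common but not guaranteed.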

Andy Ross 2009-09-09 03:56:40

Factual, yes, but it has no relevance to the question at hand. (Yes, I made a factual error in sizeof(1), too).

bbum 2009-09-09 05:21:12

It's helpful nonetheless and deserves a vote.

Skurmedel 2009-09-09 10:47:29

Answer 2

+10 A:

Measure, measure, measure.

Measure the code with the Sampler instrument in Instruments.

With that said, there is an obvious inefficiency in the C code compared to the Objective-C code.

Namely, fast enumeration -- the for(x in y) syntax -- is really fast and, more importantly, implies that splitPoints is an array or set that contains a bunch of data that has already been parsed into individual objects.

The strchr() call in the second loop implies that you are parsing stuff on the fly. In and of itself, strchr() is a looping operation and will consume time, more so as the number of characters between occurrences of the target character increases.
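The contrast between parsing on the fly and walking pre-parsed data can be sketched roughly as follows. This is an illustration only: the asker's code is not reproduced here, the '[' delimiter is taken from a comment further down, and the function names and the offsets array are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Parse on the fly: every iteration calls strchr(), which scans forward
   byte by byte until it reaches the next '[' or the terminator. */
size_t visit_by_scanning(const char *buf) {
    size_t visited = 0;
    const char *p = buf;
    while ((p = strchr(p, '[')) != NULL) {
        visited++;  /* the split point at p would be processed here */
        p++;        /* step past the '[' so the next search makes progress */
    }
    return visited;
}

/* Pre-parsed: the offsets were found earlier, so the loop only walks them,
   much like fast enumeration over an already-built collection. */
size_t visit_precomputed(const char *buf, const size_t *offsets, size_t count) {
    size_t visited = 0;
    for (size_t i = 0; i < count; i++) {
        if (buf[offsets[i]] == '[') {
            visited++;  /* the split point at buf + offsets[i] would be processed here */
        }
    }
    return visited;
}
```

Both functions visit the same split points; the second simply does not pay for the search inside the timed loop. Whether that accounts for the measured difference is something only a profiler run can confirm.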

That is all conjecture, though. As with all optimizations, speculation is useless and gathering concrete data using the [rather awesome] set of tools provided is the only way to know for sure.

Once you have measured, then you can make it faster.

bbum 2009-09-09 03:25:26

For what it's worth, I can't imagine `strchr()` being terribly inefficient. It's a simple function and should be fairly well optimized. That said, it can still be bad.

Chris Lutz 2009-09-09 03:46:52

`strchr()` is fast. The difference is that the array implies that parsing is already done, while the loop with `strchr()` adds the cost of doing the parse inside the loop.

bbum 2009-12-31 22:13:50

Answer 3

A:

Vectorization can be used to speed up C code.

Example: "Even faster UTF-8 character counting"

(But maybe just try to avoid the function call strchr() in the loop condition.)

2009-09-09 10:41:30

Answer 4

+1 A:

The Obj-C code looks like it has precomputed the split points, while the C code searches for them in each iteration. Simple answer? If N is the length of buf and M the number of split points, it looks like your two snippets have complexities of O(M) versus O(N*M); which one is slower?
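How much scanning actually happens depends on where each search starts, which the snippets in the question would have to confirm. The sketch below (hypothetical helper names, with a hand-rolled search so the bytes examined can be counted) compares two cases: resuming each search right after the previous match, and restarting every search from the beginning of the buffer.

```c
#include <stddef.h>

/* Search from p for target, adding each byte looked at to *examined. */
static const char *find_counting(const char *p, char target, size_t *examined) {
    for (; *p != '\0'; p++) {
        (*examined)++;
        if (*p == target) {
            return p;
        }
    }
    return NULL;
}

/* Each search resumes after the previous match: at most N bytes in total. */
size_t cost_resuming(const char *buf, char target) {
    size_t examined = 0;
    const char *p = buf;
    while ((p = find_counting(p, target, &examined)) != NULL) {
        p++;
    }
    return examined;
}

/* The k-th split point is located by restarting from buf every time,
   for k = 1..m: the total grows toward N*M. */
size_t cost_restarting(const char *buf, char target, size_t m) {
    size_t examined = 0;
    for (size_t k = 1; k <= m; k++) {
        const char *p = buf;
        for (size_t found = 0; found < k; found++) {
            p = find_counting(p, target, &examined);
            if (p == NULL) {
                break;
            }
            p++;
        }
        if (p == NULL) {
            break;  /* fewer than k occurrences exist */
        }
    }
    return examined;
}
```

For the 9-byte string "a[bc[d[ef" with three split points, the resuming version examines 9 bytes and the restarting version 14. Either way the cost depends on the length of the string, whereas iterating precomputed split points depends only on M.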

edit: It really amazes me, though, that some would think C code is axiomatically faster than any other solution.

Michael Foukarakis 2009-09-09 10:46:08

The C code is only looking for the next '[' in each iteration of the loop, not looking through the entire length of buf for every splitpoint, right?

mahboudz 2009-09-09 20:21:46

No, but worst case scenario is that it'll have to traverse the entire string. So it's directly dependent on the length of the string, not constant.

Michael Foukarakis 2009-09-10 04:39:27

Gah, I meant "yes, but". Need coffee.

Michael Foukarakis 2009-09-10 04:39:57
