views: 765

answers: 10

For example, in <ctype.h> there are functions like isalpha().

I want to know whether writing my own isalpha function would be faster than calling the library's isalpha.


Thanks for all your instant replies! I just want to clarify my question:

So even for the isalpha function? Because you could simply take a character and check whether it is between 'a' and 'z' or between 'A' and 'Z'?

Another question: when you include a standard header like ctype.h and call just one function like isalpha, does the whole file (I mean all of its code) get loaded? My concern is that the extra size will make the program slower.

+50  A: 

Unless you have specific reason to do so (e.g., you have a specific requirement not to use the standard library or you've profiled a very specific use case where you can write a function that performs better), you should always prefer to use a standard library function where one exists rather than writing your own function.

The standard library functions are heavily optimized and very well tested. In addition, the standard library that ships with your compiler can take advantage of compiler intrinsics and other low-level details that you can't portably use in your own code.

James McNellis
So even for the isalpha function? Because you can simply pass a character and check if the character is between 'a' and 'z' || 'A' and 'Z'? Another question: when you include a std library like ctype.h and just call one function like isalpha, will the whole file (I mean all lines of code) be loaded? My concern is that the big size will make the program slower.
draw
@draw: The compiler can (and often does) inline functions where they are called; the only way you can know whether your code is faster is to profile it. Only the functions that you use should be linked into your binary, though even if other functions are linked in, generally they should not adversely affect your program's performance.
James McNellis
@draw, that's likely what the stdlib function will do as well, so yours won't be faster. And modern OSes most likely won't load the C runtime into every process; they'll just map in the shared memory segment that it's already loaded into (the first program may have to load it, but that should happen pretty quickly after you boot).
paxdiablo
@James, I made a very inconsequential change (removed a space) to your answer since SO seems to think I've voted on this question before and wouldn't let me upvote without an edit >:-/ Thought I'd better explain in case you wondered WTH was going on.
paxdiablo
@paxdiablo: No problem at all. I was just diffing the revisions and thought "wow, there's someone out there who is pickier about spacing than I am." ;-)
James McNellis
@draw: To illustrate James's point, your proposed implementation of isalpha requires four comparisons, while the implementation in glibc does a lookup into a table, a bitwise AND, and a comparison against 0. So even for a function this simple, the library version is probably faster.
Brian L
@paxdiablo thanks! I've learned a lot.
draw
Thanks to everyone!
draw
+2  A: 

Although it's most likely not going to be slower if you're careful writing it, I can almost guarantee that you're not going to make something more optimized than what's already there. The only case I can think of is if it's a function and you do the work repeatedly inline - but if it's a macro, you're not going to beat it. Use the standard.

eruciform
+1  A: 

In many C/C++ environments (e.g., Visual C++) the source of the C Runtime Library (CRT) is available. Look at the code of the CRT function and then ask yourself, "Can I make it better?"

steelbytes
Would agree, except Microsoft isn't exactly the best example for saying "you're not going to make it better." Any standards-compliant implementation not by Microsoft would serve as a better example.
Michael Aaron Safyan
+6  A: 

The standard library functions are written by, presumably, very smart people, and have been thoroughly vetted, debugged and optimized. They've been tested perhaps millions of times over in every conceivable production environment. Chances are very good that your custom function will not be better or faster.

Rob
++ Good points, except I wouldn't assume the library writers are very smart. (If they were, they could be doing something more interesting.) I would assume they are smart enough to do a good job, and so are the people critiquing their work.
Mike Dunlavey
@Mike: Some of us enjoy writing libraries :-)
James McNellis
@Mike: I still think people writing libraries are pretty smart. The best programmers I've known treated library writing as one of the few things that still entertained them.
Erik Escobedo
@James: @Erik: Yeah, me too, but I've met some people I would call very smart, but I don't think they'd call *me* that :-)
Mike Dunlavey
There are good reasons to use the C library, but performance and debugging are not among them. glibc tends to be one of the best performers and I can usually outperform it by a factor of 2-3 times on the first draft of a function and often 5-8 times after optimization (a few examples: strstr, mbrtowc, fgetc, etc.). glibc is also known to have intentional bugs (nonconformance) in how printf deals with %.10s and similar in multibyte locales.
R..
@R: You should consider contributing your improved functions to the glibc project. I'm surprised that their implementation of `strstr` or `fgetc` would be so inefficient. Is their code too generalized?
tomlogic
@tomlogic: glibc's `fgetc` and especially `mbrtowc` are slow because glibc implements both stdio and multibyte/wchar conversion with several layers of wrappers around underlying libraries that are each an order of magnitude more complicated than they need to be to do the job. And with Drepper in charge, getting any contributions accepted is virtually impossible. The eglibc fork (which Debian now uses) was created primarily because of his mismanagement. I actually plan to try getting some code included in eglibc once it's thoroughly tested.
R..
+6  A: 

Generally, you should always use the C libraries when possible. One real reason not to is when you are in an embedded environment and are EXTREMELY space limited (which is usually not the case, and virtually all embedded platforms provide C libraries for the platform).

An example may be that using the isalpha function actually drags in an object file containing all of the is... functions even though you need only the one (the object file is the typical minimum unit when linking, although some linkers can go down to individual functions).

By writing your own isalpha, you can ensure that it, and only it, is incorporated into your final binary.

In some limited cases you may get faster speeds where you have a very specific thing you are wanting to do and the library is handling a more general case. Again, only needed if a particular loop is a bottleneck in the system. You may also want to choose a different speed/space tradeoff than that chosen by the library writer, an example being changing:

int isalpha (int c) {
    /* range-check version: up to four comparisons per call */
    return ((c >= 'A') && (c <= 'Z')) || ((c >= 'a') && (c <= 'z'));
}

into:

int isalpha (int c) {
    /* one entry per character code: 1 for 'A'-'Z' and 'a'-'z', 0 otherwise
       (the full 256-entry initializer is elided here) */
    static int map[256] = {0,0,0,0,...,1,1,1,...,0,0,0};
    return map[c & 0xff];   /* mask keeps the index inside the table */
}

a (potentially) faster implementation at the cost of extra storage for the map (and you need to understand your execution environment since it's not portable).

Another reason not to use them is to provide a more secure way of dealing with things like strings where security/robustness is a CRITICAL factor. This will generally cost you a lot more time to prove correctness.

Keith Nicholas
+1 - "you have a very specific thing you are wanting to do and the library is handling a more general case" - this is the only time I'll even consider it.
detly
@Keith, your answer was pretty much exactly what I wanted to say but I like to see examples. So I added them, hope you don't mind. Let me know if you're unhappy and I'll back them out.
paxdiablo
Thanks for your reply!
draw
Another point that has validity is that not all C compilers and libraries are as mature as gcc. There are still C compilers being made today for oddball embedded devices, where you might get some good mileage out of rolling your own function.
Paul Nathan
And if you improve on an embedded compiler's Standard C Library, consider submitting the code back to the company so they can incorporate it into their next release.
tomlogic
Your lookup table version is a LOT slower than the optimal implementation `((unsigned)c|32)-97<26`.
R..
@R..: that would depend on the hardware, but the integer implementation is certainly cuter.
Stephen Canon
I think you'll have a hard time finding ANY hardware where the table version is faster in a loop with a nontrivial distribution of characters, and probably even if it's called on the same character over and over...
R..
@R..: Actually, my Core2 duo laptop says otherwise. Calling both implementations in a tight loop to count the number of alpha characters in length-1024 buffers of random characters, the table-driven method takes 2/3rds the time of the arithmetic method (1134ns vs 1757ns per 1024 characters). That said, I still prefer the arithmetic method.
Stephen Canon
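A minimal sketch of that kind of micro-benchmark (a 1024-byte buffer of random characters as described above; the iteration count, table setup, and use of clock() are arbitrary choices here, not the actual harness used for those numbers):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static int alpha_table[256];    /* 1 for 'A'-'Z' and 'a'-'z', 0 otherwise */

static int isalpha_tab(int c)   { return alpha_table[c & 0xff]; }
static int isalpha_arith(int c) { return ((unsigned)c | 32) - 'a' < 26; }

int main(void) {
    enum { N = 1024, ITERS = 100000 };
    unsigned char buf[N];
    clock_t t0, t1;
    int i, count = 0;

    for (i = 'A'; i <= 'Z'; i++) alpha_table[i] = 1;
    for (i = 'a'; i <= 'z'; i++) alpha_table[i] = 1;
    for (i = 0; i < N; i++) buf[i] = (unsigned char)rand();

    t0 = clock();
    for (int iter = 0; iter < ITERS; iter++)
        for (i = 0; i < N; i++) count += isalpha_tab(buf[i]);   /* using count keeps the loop from being optimized away */
    t1 = clock();
    printf("table:      %ld clocks (count=%d)\n", (long)(t1 - t0), count);

    count = 0;
    t0 = clock();
    for (int iter = 0; iter < ITERS; iter++)
        for (i = 0; i < N; i++) count += isalpha_arith(buf[i]);
    t1 = clock();
    printf("arithmetic: %ld clocks (count=%d)\n", (long)(t1 - t0), count);
    return 0;
}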
Are you allowing the compiler to use the i686 `cmov` instruction? Conditional jumps are rather slow. Check the generated asm and see that gcc's not doing something stupid, because I'm 90% sure the arithmetic version is faster if you generate sane asm for it.
R..
@R..: The compiled code (with llvm-gcc) for the arithmetic sequence is branch free (actually it's quite clever, using a subtract with borrow instead of a cmov and add and then patching up the result once all the characters have been checked). Using gcc instead of llvm-gcc, the compiled code uses `setbe` (again, branch-free) and gets similar performance numbers; gcc does a terrible job with the table-driven method, though, and so the two methods compare about equal.
Stephen Canon
Yes, `sbb` is the best way to do it but I mentioned `cmov` because I didn't think gcc would be smart enough to ever use `sbb`. Nice to know llvm does it. :-)
R..
+1  A: 

The only time I do not use something from the standard library is when that something is missing unless a particular extension of the library is enabled.

For instance, to get asprintf() in GNU C, you need to define _GNU_SOURCE prior to including <stdio.h>. There was even a time when strdup() was hit or miss in <string.h>.
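A minimal example of that pattern (the feature-test macro must appear before any standard headers; asprintf allocates the result buffer and returns -1 on failure):

#define _GNU_SOURCE          /* must be defined before any standard headers */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *msg = NULL;
    if (asprintf(&msg, "pid %d says %s", 12345, "hello") < 0)
        return 1;            /* allocation or formatting failed */
    puts(msg);
    free(msg);
    return 0;
}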

If I heavily depend on these extensions, then I try to include them in my code base so that I don't have to write kludges to work around their absence.

Then there are the rare instances when you want to roll your own version of something that gives, for example, POSIX behavior (or something else) by default in a better way.

Other than that, implementing something from stdc on your own seems a little bit silly, beyond the value of being a good learning exercise.

Tim Post
A: 

A few lines or one line of C code does not necessarily translate into the simplest and fastest solution. A memcpy written as while(--len) *d++ = *s++; is definitely the slowest. The libraries are generally well done and fast, and you may have a hard time improving upon them. Places where you may see a gain are on specific platforms where you know something about the platform that the compiler doesn't. For example, the target may be a 32-bit processor, but you may know that 64-bit-aligned accesses are faster and may want to modify the library to take advantage of that special situation. But in general, across all platforms and targets, you likely will not do better; target-specific optimizations have been written for popular targets and are already in the C library for the popular compilers.
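As a rough sketch of the kind of target-aware copy loop being described (the function name is made up; the word size here is just sizeof(size_t), and a real library memcpy handles many more cases, including mismatched alignment, than this does):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

void *my_memcpy(void *dst, const void *src, size_t len) {
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* only attempt word copies when both pointers share the same alignment */
    if (((uintptr_t)d & (sizeof(size_t) - 1)) == ((uintptr_t)s & (sizeof(size_t) - 1))) {
        /* copy leading odd bytes until the pointers are word aligned */
        while (len && ((uintptr_t)d & (sizeof(size_t) - 1))) {
            *d++ = *s++;
            len--;
        }
        /* copy one machine word at a time */
        while (len >= sizeof(size_t)) {
            memcpy(d, s, sizeof(size_t));   /* typically compiles to one word load/store */
            d += sizeof(size_t);
            s += sizeof(size_t);
            len -= sizeof(size_t);
        }
    }
    /* copy any remaining (or all, if misaligned) bytes */
    while (len--)
        *d++ = *s++;
    return dst;
}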

dwelch
Actually your example of a "slow" `memcpy` is likely the fastest for small buffers, which is arguably where performance is most critical. Most of the optimized versions have sufficient overhead that they perform horribly if they happen to get called with `len<8` or so. Of course it's possible to minimize this, but I've found glibc is notoriously bad about optimizing for the asymptotic cases without even thinking about ensuring reasonable performance for common small-`n` cases.
R..
@R..: a reasonable compiler will inline code sequences for `memcpy` calls when the length is a small compile-time constant. Beyond that, a well written memcpy will be fast even for very small buffers--I have implemented a platform's `memcpy`; my experience is that on a sane architecture, one can match or exceed the speed of the simple character-copy loop by the time the size reaches 4.
Stephen Canon
@R.. Most systems these days use something other than an 8-bit bus, so 8-bit copies are slow. Depending on the bus and wait states, what the cache does on each of the writes, etc., it doesn't take very many bytes (perhaps even a few to a dozen) before a few instructions to figure out the alignment of the bytes to the memory space, copy the leading odd bytes, copy memory-bus-width objects, then copy the trailing odd bytes, can be faster than just a few instructions to do it all in bytes. At the same time, agreed, small buffers suffer from memcpy, which is why I would avoid memcpy for small copies anyway.
dwelch
So a couple of instructions to compare the length against the magic limit below which the optimized path is slower, and to branch to the byte-copy loop instead of the optimized copy loops, is a memcpy improvement that someone might want to add to a standard C library for your particular target.
dwelch
What I do is something like (pseudocode): `if (src_low_bits == dest_low_bits) { while (src_low_bits) copy_byte; while (len >= word_size) copy_word; } while (len) copy_byte;`
R..
That is what C libraries do already, depending on the target, C library, etc.
dwelch
@Stephen: Yeah, a constant-sized memcpy will blow the pants off strcpy or a variable-length memcpy. For example, some may think it's faster to fill a 64-byte string by writing the string and then the \0's, but it is much faster to write all 64 zeros and then write the string. Then, once you have that, always copy it around with a 64-byte memcpy. Itanium 2 can do it in something like one machine cycle.
Zan Lynx
+11  A: 

isalpha does not merely check if its argument is in the ranges A-Z, a-z. Quoting the C standard (§7.4.1.2):

The isalpha function tests for any character for which isupper or islower is true, or any character that is one of a locale-specific set of alphabetic characters for which none of iscntrl, isdigit, ispunct, or isspace is true.

In all likelihood you can write a more limited version (as you suggest) that is faster for the subset of cases that it handles, but it won't be the isalpha function. Library routines exist not only to be efficient, but to be complete and correct. Efficiency actually turns out to be the easy part; getting all the edge cases right is where the hard work comes in.


Note also that if you're going to write an optimized version that targets English/ASCII, you can do it rather more efficiently than what you suggested, either with the lookup table that someone else suggested, or my personal preference (edited to fix an error caught by R..):

int isalpha(int c) {
    /* OR with 32 folds 'A'-'Z' onto 'a'-'z'; the unsigned subtraction
       then makes a single range check suffice */
    return ((unsigned int)(c | 32) - 97) < 26U;
}
Stephen Canon
Your cast is in the wrong place; this code has undefined behavior due to signed integer overflow issues. You want to use the following: `return ((unsigned)c|32)-'a' < 26;` Casting `c` to `unsigned` *before* you do anything with it is critical.
R..
@R..: I was thinking only in terms of `c` that come from actual ASCII `char` values, but you're correct that it invoked undefined behavior for extreme negative integer values of `c`. I updated my answer.
Stephen Canon
Note that this is actually one of the only sane reasons to write your own isalpha: to check specifically for ASCII letters. (Admittedly, there are few applications where you would be able to get Unicode letters and still would not want them recognized as "Letters".)
David X
@DavidX: the problem is not that you don't want Unicode characters, but that `isalpha` interprets things according to your locale's encoding, whereas (for example) text coming over the internet in an email or web page needs to be interpreted not in your locale's encoding but the encoding specified in its headers. Thus you can't use `isalpha` etc.
R..
@R.. True, I was thinking of source code, but web pages are probably a more common use. It would be nice to have an `int isalpha_locale(int c, const char* locale);` available.
David X
POSIX 2008 has `isalpha_l` to which you pass a locale_t object. This makes it possible either to build a C locale object for when you need the functions for ASCII parsing purposes, or load the user's locale in a locale object for the cases where you want to process natural-language text in the user's locale. However, the only standard locales are "C"/"POSIX" and "" referring to the environment-specified locale. Availability (and naming) of locales is system-specific, and a modern system in the US is likely only to have en_US.UTF-8.
R..
Also, for modern Unicode usage, you really need to process strings, not bytes/characters. This is not just because of UTF-8 (or UTF-16) encoding, but also combining marks, multi-character case mappings, etc.. So even if there were a standardly accessible "Unicode" locale it would be of extremely limited usefulness for correct Unicode text processing...
R..
@R..: that's what happens when you write code on a webpage after not sleeping for 20 hours and never test it =)
Stephen Canon
@Stephen Canon, it's also why my version with char literals might be a good choice. :-)
R..
@R.. re `isalpha_l`, neat, didn't know about that.
David X
+4  A: 

There are already a bunch of answers here, but none except Stephen Canon's address the most important part: the different semantics. This is the most important factor in choosing which functions to use.

The standard C library isalpha etc. functions are specified to work according to the current locale. If you leave the locale as the default "C" locale (by failing to call setlocale), they have very predictable behavior, but this precludes using the only standardized method for the application to detect and use the system's/user's preferred character encoding, number formatting, message language, and other localization preferences.
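A small example of that locale dependence (the locale name below is an assumption; which locales are installed is system-specific):

#include <ctype.h>
#include <locale.h>
#include <stdio.h>

int main(void) {
    unsigned char c = 0xE9;   /* 'é' in ISO-8859-1 */

    /* default "C" locale: only A-Z and a-z count as alphabetic */
    printf("C locale:     isalpha(0xE9) = %d\n", isalpha(c));

    /* locale names are system-specific and the locale may not be installed */
    if (setlocale(LC_CTYPE, "fr_FR.ISO-8859-1"))
        printf("fr_FR locale: isalpha(0xE9) = %d\n", isalpha(c));

    return 0;
}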

On the other hand, if you implement your own isalpha (the optimal implementation is ((unsigned)c|32)-'a'<26, or if you like code that's more self-documenting, ((unsigned)c|('A'^'a'))-'a' <= 'z'-'a'), it always has very predictable behavior regardless of locale.

I would go so far as to attach a "considered harmful" label to using the standard isalpha, etc. functions for anything except naive text processing of text assumed to be in the user's locale format. These functions are especially unsuited for parsing configuration files, text-based network transactions, HTML, programming language sources, etc. (The one exception is isdigit, which ISO C requires to be equivalent to return (unsigned)c-'0'<10;.) On the other end of the spectrum, if you're writing an application with advanced natural-language text handling (like a word processor or web browser), it will need much more advanced character property handling than the C library can provide, and you should be looking for a good Unicode library.

R..
+1  A: 

Interestingly, the implementation of isalpha() in your question is slower than the most common implementation provided with C libraries 30 years ago. Remember, this is a function that will be used in a critical inner loop of your average C compiler. :)

I'll admit that current library implementations are probably marginally slower than they used to be, because of character set issues that we have to deal with today.
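For reference, a rough sketch of that traditional style of implementation, i.e. a table of classification bits consulted with one load, one AND and one test (the names and flag values are illustrative, not any particular vendor's):

#include <stdio.h>

#define MY_UPPER 0x01
#define MY_LOWER 0x02

static unsigned char my_ctype[256];

/* historically the table was initialized statically; a loop keeps the sketch short */
static void init_ctype(void) {
    int c;
    for (c = 'A'; c <= 'Z'; c++) my_ctype[c] |= MY_UPPER;
    for (c = 'a'; c <= 'z'; c++) my_ctype[c] |= MY_LOWER;
}

/* historically this was a macro so the lookup could be inlined everywhere */
#define my_isalpha(c) (my_ctype[(unsigned char)(c)] & (MY_UPPER | MY_LOWER))

int main(void) {
    init_ctype();
    printf("%d %d\n", !!my_isalpha('Q'), !!my_isalpha('4'));   /* prints "1 0" */
    return 0;
}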

Darron