How do you reverse a string in C or C++ without requiring a separate buffer to hold the reversed string?
#include <cstdio>
#include <cstring>

void strrev(char *str)
{
    /* Nothing to do for NULL or empty strings. The empty check also keeps
       strlen(str) - 1 from wrapping around below. */
    if( str == NULL || *str == '\0' )
        return;

    char *end_ptr = &str[strlen(str) - 1];
    char temp;
    while( end_ptr > str )
    {
        temp = *str;
        *str++ = *end_ptr;
        *end_ptr-- = temp;
    }
}

int main(int argc, char *argv[])
{
    char buffer[32];

    strcpy(buffer, "testing");
    strrev(buffer);
    printf("%s\n", buffer);

    strcpy(buffer, "a");
    strrev(buffer);
    printf("%s\n", buffer);

    strcpy(buffer, "abc");
    strrev(buffer);
    printf("%s\n", buffer);

    strcpy(buffer, "");
    strrev(buffer);
    printf("%s\n", buffer);

    strrev(NULL);

    return 0;
}
This code produces this output:
gnitset
a
cba
std::reverse(str.begin(), str.end());
This is the simplest way in C++.
Use the std::reverse algorithm from the C++ Standard Library (declared in <algorithm>).
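As a minimal sketch of the std::reverse approach, assuming the text lives in a std::string (the function name reverse_in_place is made up for this example):

```cpp
#include <algorithm>
#include <string>

// Reverses the char elements of `str` in place; no second buffer is used.
// This works on individual char elements, so it is only meaningful for
// single-byte encodings such as ASCII or Latin-1.
void reverse_in_place(std::string &str)
{
    std::reverse(str.begin(), str.end());
}
```

Calling reverse_in_place on a string holding "testing" leaves it holding "gnitset".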
Evil C:
#include <stdio.h>

void strrev(char *p)
{
    char *q = p;
    if(!p || !*p) return; /* nothing to reverse in a NULL or empty string */
    while(q && *q) ++q;
    for(--q; p < q; ++p, --q)
        *p = *p ^ *q,
        *q = *p ^ *q,
        *p = *p ^ *q;
}

int main(int argc, char **argv)
{
    do {
        printf("%s ", argv[argc-1]); strrev(argv[argc-1]);
        printf("%s\n", argv[argc-1]);
    } while(--argc);

    return 0;
}
(This is the XOR-swap trick. Note that you must avoid swapping a value with itself, because a^a==0.)
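The loop above never swaps a char with itself because it stops as soon as p meets q. If the swap is pulled out into a helper, it probably wants its own guard. A sketch of that, with the hypothetical names xor_swap and strrev_xor:

```cpp
#include <cstring>

// XOR-swap two chars through pointers. The guard matters: if a and b
// point at the same char, the first XOR would zero it and the value is lost.
static void xor_swap(char *a, char *b)
{
    if (a == b)
        return;          // same object: nothing to swap
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

// Reverse a NUL-terminated string in place using xor_swap.
// NULL and empty strings are left alone.
void strrev_xor(char *p)
{
    if (p == nullptr || *p == '\0')
        return;
    char *q = p + std::strlen(p) - 1;   // last char before the terminator
    for (; p < q; ++p, --q)
        xor_swap(p, q);
}
```

Two different chars that happen to hold the same value are fine; only aliasing (the same address twice) destroys the data.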
Ok, fine, let's fix the UTF-8 chars...
#include <stdio.h>

#define SWP(x,y) (x^=y, y^=x, x^=y)

void strrev(char *p)
{
    char *q = p;
    if(!p || !*p) return; /* nothing to do for NULL or empty strings */
    while(q && *q) ++q; /* find eos */
    for(--q; p < q; ++p, --q) SWP(*p, *q);
}

void strrev_utf8(char *p)
{
    char *q = p;
    if(!p || !*p) return; /* nothing to do for NULL or empty strings */
    strrev(p); /* call base case */

    /* Ok, now fix bass-ackwards UTF chars. */
    while(q && *q) ++q; /* find eos */
    while(p < --q)
        switch( (*q & 0xF0) >> 4 ) {
        case 0xF: /* U+010000-U+10FFFF: four bytes. */
            SWP(*(q-0), *(q-3));
            SWP(*(q-1), *(q-2));
            q -= 3;
            break;
        case 0xE: /* U+000800-U+00FFFF: three bytes. */
            SWP(*(q-0), *(q-2));
            q -= 2;
            break;
        case 0xC: /* fall-through */
        case 0xD: /* U+000080-U+0007FF: two bytes. */
            SWP(*(q-0), *(q-1));
            q--;
            break;
        }
}

int main(int argc, char **argv)
{
    do {
        printf("%s ", argv[argc-1]); strrev_utf8(argv[argc-1]);
        printf("%s\n", argv[argc-1]);
    } while(--argc);

    return 0;
}
- Why, yes, if the input is borked, this will cheerfully swap outside the place.
- Useful link when vandalising in the UNICODE: http://www.macchiato.com/unicode/chart/
- Also, UTF-8 above 0x10000 is untested (I don't seem to have a font for it, nor the patience to use a hex editor).
$ ./strrev Räksmörgås ░▒▓○◔◑◕●
░▒▓○◔◑◕● ●◕◑◔○▓▒░
Räksmörgås sågrömskäR
./strrev verrts/.
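Since the UTF-8 version trusts its input, one way to keep it from swapping outside the string is to check the byte structure before reversing. The following is only a sketch: the names utf8_seq_len and utf8_is_well_formed are invented here, and the check is structural (lead bytes followed by the right number of continuation bytes), not full UTF-8 validation.

```cpp
#include <cstddef>
#include <cstring>

// Returns the sequence length implied by a UTF-8 lead byte,
// or 0 if the byte cannot start a sequence.
static std::size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;                             // stray continuation or invalid byte
}

// Structural check only: every lead byte must be followed by the right
// number of continuation bytes (10xxxxxx). Overlong forms and surrogates
// are not rejected.
bool utf8_is_well_formed(const char *s)
{
    if (s == nullptr)
        return false;
    std::size_t len = std::strlen(s);
    std::size_t i = 0;
    while (i < len) {
        std::size_t n = utf8_seq_len(static_cast<unsigned char>(s[i]));
        if (n == 0 || i + n > len)
            return false;                 // bad lead byte or truncated sequence
        for (std::size_t k = 1; k < n; ++k)
            if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80)
                return false;             // expected a continuation byte
        i += n;
    }
    return true;
}
```

A caller could run this first and fall back to doing nothing (or to a plain byte reversal) when it returns false.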
Non-evil C, assuming the common case where the string is a null-terminated char array:
#include <stddef.h>
#include <string.h>

/* PRE: str must be either NULL or a pointer to a
 * (possibly empty) null-terminated string. */
void strrev(char *str) {
    char temp, *end_ptr;

    /* If str is NULL or empty, do nothing */
    if( str == NULL || !(*str) )
        return;

    end_ptr = str + strlen(str) - 1;

    /* Swap the chars */
    while( end_ptr > str ) {
        temp = *str;
        *str = *end_ptr;
        *end_ptr = temp;
        str++;
        end_ptr--;
    }
}
In the interest of completeness, it should be pointed out that there are representations of strings on various platforms in which the number of bytes per character varies depending on the character. Old-school programmers would refer to this as DBCS (Double Byte Character Set). Modern programmers more commonly encounter this in UTF-8 (as well as UTF-16 and others). There are other such encodings as well.
In any of these variable-width encoding schemes, the simple algorithms posted here (evil, non-evil or otherwise) would not work correctly at all! In fact, they could even cause the string to become illegible or even an illegal string in that encoding scheme. See Juan Pablo Califano's answer for some good examples.
std::reverse() would only work in this case if the iterators it is given step over whole characters. The iterators of std::string step over individual char elements, so on a UTF-8 string std::reverse() reverses bytes, just like the C versions.
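For UTF-8 specifically, one possible way to keep multi-byte characters intact is to reverse the bytes inside each sequence first and then reverse the whole buffer, which puts each sequence back in order. This is a sketch under the assumption that the input is well-formed UTF-8; the name reverse_utf8_code_points is made up for the example. It works on code points, not on user-perceived characters, so combining marks end up attached to a different base character.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Reverse a UTF-8 string by code point, in place.
// Assumes `s` holds well-formed UTF-8; unexpected bytes are treated as
// one-byte units rather than rejected.
void reverse_utf8_code_points(std::string &s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t n = 1;
        if ((lead & 0xE0) == 0xC0) n = 2;
        else if ((lead & 0xF0) == 0xE0) n = 3;
        else if ((lead & 0xF8) == 0xF0) n = 4;
        if (i + n > s.size())
            n = s.size() - i;   // truncated sequence: stay inside the string
        // Step 1: reverse the bytes of this one code point.
        std::reverse(s.begin() + i, s.begin() + i + n);
        i += n;
    }
    // Step 2: reverse every byte; each code point's bytes end up back in order.
    std::reverse(s.begin(), s.end());
}
```

No second buffer is needed, at the cost of touching every byte twice.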
Note that the beauty of std::reverse is that it works with char * strings and std::wstrings just as well as std::strings:
#include <algorithm>
#include <cstring>

void strrev(char *str)
{
    if (str == NULL)
        return;
    std::reverse(str, str + strlen(str));
}
If you're looking to reverse NUL-terminated buffers, most solutions posted here are OK. But, as Tim Farley already pointed out, these algorithms work only if it is valid to assume that a string is semantically an array of bytes (i.e. a single-byte encoding), which I think is a wrong assumption.
Take, for example, the string "año" ("year" in Spanish).
The Unicode code points are 0x61, 0xf1, 0x6f.
Consider some of the most used encodings:
Latin1 / iso-8859-1 (single byte encoding, 1 character is 1 byte and vice versa):
Original:
0x61, 0xf1, 0x6f, 0x00
Reverse:
0x6f, 0xf1, 0x61, 0x00
The result is OK
UTF-8:
Original:
0x61, 0xc3, 0xb1, 0x6f, 0x00
Reverse:
0x6f, 0xb1, 0xc3, 0x61, 0x00
The result is gibberish and an illegal UTF-8 sequence
UTF-16 Big Endian:
Original:
0x00, 0x61, 0x00, 0xf1, 0x00, 0x6f, 0x00, 0x00
The first byte will be treated as a NUL-terminator. No reversing will take place.
UTF-16 Little Endian:
Original:
0x61, 0x00, 0xf1, 0x00, 0x6f, 0x00, 0x00, 0x00
The second byte will be treated as a NUL-terminator. The result will be 0x61, 0x00, a string containing the 'a' character.
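The four cases above can be reproduced with a small helper that mimics what the strlen-based algorithms see. This is an illustration only: the constant names and naive_reverse are invented for the sketch, and it copies the bytes into a std::string purely so the result is easy to inspect.

```cpp
#include <algorithm>
#include <cstring>
#include <string>

// "año" in the encodings discussed above, as raw bytes. The compiler
// appends one extra 0x00 to each literal.
const char kLatin1[]  = "\x61\xf1\x6f";
const char kUtf8[]    = "\x61\xc3\xb1\x6f";
const char kUtf16Be[] = "\x00\x61\x00\xf1\x00\x6f\x00";
const char kUtf16Le[] = "\x61\x00\xf1\x00\x6f\x00\x00";

// Byte-wise reversal of whatever strlen() considers to be the string,
// which is what the simple char* algorithms do. Returns the bytes that
// remain visible as a C string afterwards.
std::string naive_reverse(const char *bytes)
{
    std::string visible(bytes, std::strlen(bytes));
    std::reverse(visible.begin(), visible.end());
    return visible;
}
```

With these inputs, the Latin-1 result is 0x6f, 0xf1, 0x61; the UTF-8 result is 0x6f, 0xb1, 0xc3, 0x61; the UTF-16 big-endian result is empty; and the UTF-16 little-endian result is the single byte 0x61.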
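In case you are using GLib, it has two functions for that: g_strreverse(), which reverses the bytes of a string in place, and g_utf8_strreverse(), which is UTF-8 aware but returns a newly allocated string.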