tags:

views:

307

answers:

8

Hi,

I am trying to code a trival interview question of reversing a string.

This is my code:

#include <string.h>

char* rev( char* str)
{
    int i,j,l;

    l = strlen(str);

    for(i=0,j=l-1; i<l/2 ; i++, j--)
    {
        str[i] = (str[i] + str[j]);
        str[j] = str[i] - str[j];
        str[j] = str[i] - str[j];
    }

    return str;
}

int main()
{
    char *str = " hello";
    printf("\nthe reverse is %s ...", rev(str));

    return 1;
}

Basically, this one gives a segmentation fault.

I have following questions:

  1. I get segmentation fault probably because, the characters add up to something not defined in ascii and hence I cannot store them back as characters, I am using www.codepad.org [I wonder if it supports just ascii !!] . Is my understanding correct or there is something else to it.

  2. How do I correct the problem , for the same platform [I mean swapping in place for codepad.org]

  3. Here I have to use an additional integer l to calculate length. So to save a single char space by swapping in place .. I am using an extra int !!! .. just to impress the inteviewer :) ... Is this approach eve worth it !!!

  4. This one is for those who are interested in writing unit tests/API tests . I want to have a robust implementation so what can be possible test cases. I assume that if interviewer asks such a simple question .. he definitely wants some very roboust implementation and test cases. Few that I thought off:

    • passing empty strings passing integer

    • strings passing integer array instead of char array.

    • very long string ,

    • single char string string of special chars.

Any advise/suggestions would be helpful.

+7  A: 

Use a temporary variable rather than your approach for the swap. The compiler will probably use a register for the temporary variable due to optimizations.'

Either way, you implemented the swap algorithm wrong. It should be

str[i] = str[i] + str[j];
str[j] = str[i] - str[j];
str[i] = str[i] - str[j];
mathepic
Dang, didn't see your edit while writing my extremely long response.
Artelius
+8  A: 

This line:

char *str = " hello";

Probably points to read-only memory. Try this:

char str[] = " hello";

(You have some other bugs too, but this change will fix your segfault).

Carl Norum
Hi Carl, Thanks for the reply. That worked !!. Is that specific to codepad.org .. or thats a general rule ! .. I thought .. *str or st[] .. both are base pointers
p1
That is a general C rule. The way I understand it, codepad.org is just like a standard Linux computer using gcc, you shouldn't worry too much about the difference.
HalfBrian
A: 

as for testing:

  1. null argument
  2. empty string argument
  3. length 1 string argument
  4. various other lengths - perhaps one long string

you can of course implement the test method with the following strategy:

  1. a generic verify method

    verifyEquals( expected, actual ) { ... }

  2. a test method with various cases:

    testReverse() { verifyEquals(NULL, rev(NULL)); verifyEquals("", rev("")); verifyEquals("a", rev("a")); verifyEquals("ba", rev("ab")); verifyEquals("zyx", rev("xyz")); verifyEquals("edcba", rev("abcde")); }

You can also refactor the swap "algorithm" into a separate procedure and unit test it as well.

LES2
A: 

I get segmentation fault probably because, the characters add up to something not defined in ascii and hence I cannot store them back as characters

I don't think so. They're all just numbers in C (albeit only 1 byte long), but you shouldn't have any problem there.

I think (but I'm not sure) that the problem is with this:

char *str = " hello";
printf("\nthe reverse is %s ...", rev(str));

What you're actually doing is creating the char array " hello", which is a constant array. That means, basically, that you're not supposed to change it. When you call rev, it actually changes the array in-place, so it's trying to assign new values to a constant char.

Since you do char* str = "hello", you're actually casting "hello" to an unsigned char, so this isn't treated as a compile-time error. But because "hello" is what's called a "string literal", it's being created as part of the executable file itself, i.e. it's not in the memory your program can freely change. That is why you're actually getting the run-time seg-fault, and not a compile-time error (although you should probably be getting a warning about this).

Edan Maor
Actually, there is a problem with overflows in signed arithmetic. See this message: http://lists.gforge.inria.fr/pipermail/frama-c-discuss/2009-June/001235.html
Pascal Cuoq
+2  A: 

If you are going to implement the swap of two characters without a temporary variable (which is a neat trick but not something that you should actually use in practice), it would be prudent to either use the "bitwise exclusive or" instead of addition/substraction, or use unsigned char instead of char, because overflow in signed arithmetic is undefined in the C99 standard, and guess what, gcc started to make use of this undefinedness for optimization purposes. I was just ranting about another case of unwanted optimization in another question.

Pascal Cuoq
+3  A: 

Where to start...

OK, first you should be aware that your routine reverses a string in place, in other words changes are made to the original buffer.

This means that you could do

int main()
{
    char str[] = "hello";
    rev(str);
    printf("\nthe reverse is %s ...", str);

    return 0;
}

and the string would be reversed.

The other alternative is to create a new string that is a reversed copy of the original string. The algorithm is somewhat different, you should be able to do this too.

Next point:

    str[i] = (str[i] + str[j]);
    str[j] = str[i] - str[j];
    str[j] = str[i] - str[j];

is broken. It should be

    str[i] = str[i] + str[j];
    str[j] = str[i] - str[j];
    str[i] = str[i] - str[j];

But, as ~mathepic said, you should do this instead:

    temp = str[i];
    str[i] = str[j];
    str[j] = temp;

Also: codepad makes it difficult to debug your code. Install a compiler and debugger (such as gcc and gdb) on your own computer.

the characters add up to something not defined in ascii and hence I cannot store them back as characters, I am using www.codepad.org [I wonder if it supports just ascii !!] . Is my understanding correct or there is something else to it.

In most C implementations (those that run on 32-bit PCs anyway), a char is an 8-bit integer. An int is a 32-bit integer. When you add or subtract two chars and the result is more than 8 bits, it will "wrap around" to some other value, but this process is reversible.

For instance, 255 + 1 gives 0, but 0 - 1 = 255. (Just an illustrative example.) This means that "I cannot store them back as characters" is not the problem here.

I want to have a robust implementation

You want to show that you take into account the costs and benefits of different design choices. Perhaps it is better to cause a segmentation fault if your routine is supplied with a NULL, because this very quickly alerts the programmer of a bug in his code.

passing empty strings

You must make sure your code works in a case like that.

passing integer passing integer array instead

You can't pass integers, or an int [] to a function expecting a char *. In C, you cannot tell whether a char * really is a string or something else.

single char string

Make sure your routine works for a single char string, also for both strings with an odd number and an even number of chars.

string of special chars

There are no special chars in C (except, by convention, the null terminator '\0'). However, multi-char sequences are something that must be considered (reversing a UTF-8 string is different from reversing a regular string). However if the question does not specify I don't think you should be concerned about this.

Three final points:

  • In main(), return 1; usually indicates your program failed. return 0; is more common but return EXIT_SUCCESS; is best, though you may need to #include <stdlib.h>.
  • Consider using more descriptive variable names.
  • Consider making a strnrev() function, similar to the strncpy() and similar functions, where the function will not go beyond n characters if the null terminator is not found there.
Artelius
Note: I know that integer overflow is undefined, but *in practice*, when dealing with addition and subtraction it just wraps around. Didn't want to overcomplicate the issue.
Artelius
+5  A: 

Page 62 of Kernighan & Ritchie's The C Programming Language shows an algorithm for in-place string reversal with a temporary variable.

Similar to this:

char* rev_string(char* const str)
{
  int i, j;
  char tmp;
  for(i = 0, j = strlen(str)-1; i < j; i++; j--)
  {
    tmp = str[i];
    str[i] = str[j];
    str[j] = tmp;
  }
  return str;
}

This algorithm is easier to understand than the one without a temporary variable, imho.

As for item #3 in your list of questions:

As an interviewer, I would want to see simple, clear, and well structured code. That's impressive. Trickery will not impress me. Especially when its comes to premature optimization. BTW, my solution reverses the string in place with one additional char instead of an int. Impressive? :)

And item #4:

One other test case would be an unterminated string. Is your function robust enough to handle this case? Your function will only be as robust as the least robust part of it. Passing an unterminated string into my solution causes a Segmentation Fault, due to strlen reporting an incorrect string length. Not very robust.

The important point about robustness is, your code might be robust but you have to make sure all other external functions you use are, too!

cschol
+1 for no trickery.
Thanatos
A: 

Thanks all for the reply. Here is the code with changes everyone suggested:

#include <string.h>

char* rev( char* str)
{

int start ,end ,len;

    len = strlen(str);

    for(start =0,end =len-1; start <len/2 ; start ++, end --)
    {
        str[start ] = str[start ] + str[end ];
        str[end ] = str[start ] - str[end ];
        str[start] = str[start ] - str[end ];
    }

    return str;
}

int main()
{

   char str[] = " hello there !";

printf("\n the reverse string is %s ...", rev(str));

    return 1;
}

The segmentation fault was bec *str was pointing to read only memory , change it to str[]. Thanks Carl Norum for pointing that out.

  • Any test cases [specifically for API testing] ?
p1