The code below is said to give a segmentation violation:

#include <stdio.h> 
#include <string.h> 

void function(char *str) {
   char buffer[16];

   strcpy(buffer,str);
}

int main() {
  char large_string[256];
  int i;

  for( i = 0; i < 255; i++)
    large_string[i] = 'A';

  function(large_string);
  return 1;
}

It's compiled and run like this:

gcc -Wall -Wextra hw.cpp && a.exe

But it produces no output and no crash.

NOTE

The above code does overwrite the ret address (and more), if you understand what is going on underneath.

The ret address will be 0x41414141 to be specific.

Important: this requires profound knowledge of the stack.

+6  A: 

You're just getting lucky. There's no reason that code has to generate a segmentation fault (or any other kind of error). It's still probably a bad idea, though. You can probably get it to fail by increasing the size of large_string.

Carl Norum
Well, since large_string isn't null-terminated, who knows when strcpy actually stops copying.
KTC
@KTC, sure, but it's *guaranteed* to copy more than `sizeof(buffer)`, so what's the difference?
Carl Norum
No, the size is enough to overwrite the *ret* address
@user198729, maybe, maybe not. That's the magic of undefined behaviour. Maybe the return address is in a register?
Carl Norum
There is no magic if you really understand it. It seems you don't know the physical positions of *buffer*, *sfp*, *ret* and *large_string*.
@user198729, if you know everything about how the system works, why are you claiming there's a problem? Go step through it in a debugger and *see* why it works when you think it should fail.
Carl Norum
According to my knowledge it should give a segmentation fault, but it doesn't. It's at the assembly level, so a debugger won't help.
@user198729, what kind of debugger do you have that won't let you step through disassembly?
Carl Norum
In fact I have no debugger at hand right now :( So for the moment I'd prefer to solve it by analysis.
@user198729, what architecture? What's the calling convention? Answering those two questions will give you the answer to your problem.
Carl Norum
x86, with the ABI's calling convention.
@user198729: if this is on Windows, OllyDbg will happily help you trace the issue. If it is Linux, then you can use edb (http://codef00.com/projects.php#debugger).
Evan Teran
+1  A: 

It's UB (undefined behavior). strcpy might have copied more bytes into the memory pointed to by buffer without causing a problem at that moment.

aJ
+1  A: 

It's undefined behavior, which means that anything can happen. The program can even appear to work correctly.

It seems that you just happen not to overwrite any parts of memory that are still needed by the rest of the (short) program (or that are outside the program's address space, write-protected, ...), so nothing special happens. At least nothing that would lead to any output.

sth
A: 

There may be anything in your 'char buffer[16]', including \0. strcpy copies until it finds the first \0, thus not going beyond your boundary of 16 characters.

Glorphindale
strcpy(dest, src): the code is copying from *large_string* to *buffer*, not the other way round...
KTC
Oops :) Gotta be more careful with documentation reading.
Glorphindale
+1  A: 

There's a zero byte on the stack somewhere that stops the strcpy(), and there's enough room on the stack not to hit a protected page. Try printing out strlen(buffer) in that function. In any case the result is undefined behavior.

Get into the habit of using the strlcpy(3) family of functions.
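
As a rough sketch of what a bounded copy looks like: strlcpy is a BSD extension rather than part of ISO C, so the version below uses the standard snprintf instead. The helper name copy_bounded and its return convention are assumptions made for this illustration, not something from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (name and return convention are assumptions).
 * Copies src into dst, writing at most dst_size bytes including the
 * terminating '\0'. Returns 1 if src fit completely, 0 if truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf never writes more than dst_size bytes and always
     * terminates the result when dst_size > 0. */
    int n = snprintf(dst, dst_size, "%s", src);
    return n >= 0 && (size_t)n < dst_size;
}
```

Inside function(), a call such as copy_bounded(buffer, sizeof buffer, str) would keep at most 15 characters plus the terminator. It still requires str itself to be null-terminated, which large_string in the question is not guaranteed to be.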

Nikolai N Fetissov
+3  A: 

Probably in your implementation buffer is immediately below large_string on the stack. So when the call to strcpy overflows buffer, it's just writing most of the way into large_string without doing any particular damage. It will write at least 255 bytes, but whether it writes more depends what's above large_string (and the uninitialised value of the last byte of large_string). It seems to have stopped before doing any damage or segfaulting.
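
One way to test that guess without overflowing anything is to print the two addresses. This is only a sketch: the helper name buffer_position is made up for this illustration, the result is specific to one compiler, one set of flags and one run, and the C standard gives no meaning to the ordering of addresses of unrelated objects.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not from the question): reports whether an
 * object in the caller's frame lies above (+1) or below (-1) a local
 * 16-byte buffer in this function's frame. */
int buffer_position(char *caller_obj) {
    char buffer[16];
    buffer[0] = '\0';
    assert(caller_obj != NULL);

    /* Convert to integers first: comparing pointers to unrelated
     * objects with < or > is itself undefined behaviour. */
    uintptr_t callee = (uintptr_t)(void *)buffer;
    uintptr_t caller = (uintptr_t)(void *)caller_obj;

    printf("buffer       at %p\n", (void *)buffer);
    printf("caller array at %p\n", (void *)caller_obj);
    return caller > callee ? 1 : -1;
}
```

Calling buffer_position(large_string) from main and getting +1 with a small gap between the addresses would be consistent with the layout guessed at above. It says nothing about other builds, and an optimising compiler may inline the call and change the picture.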

By fluke, the return address of the call to function isn't being trashed. Either it's below buffer on the stack or it's in a register, or maybe the function is inlined, I can't remember what no optimisation does. If you can't be bothered to check the disassembly, I can't either ;-). So you're returning and exiting without problems.

Whoever said that code would give a segfault probably isn't reliable. It results in undefined behaviour. On this occasion, the behaviour was to output nothing and exit.

[Edit: I checked on my compiler (GCC on cygwin), and for this code it is using the standard x86 calling convention and entry/exit code. And it does segfault.]

Steve Jessop
I'm using GCC+MinGW
+2  A: 

You're compiling a .cpp (C++) program by invoking gcc (instead of g++). I'm not sure if this is the cause, but on a Linux system (it appears you're running on Windows, given the default .exe output) it produces the following error when compiled as you have stated:

/tmp/ccSZCCBR.o:(.eh_frame+0x12): undefined reference to `__gxx_personality_v0'
collect2: ld returned 1 exit status

HopefullyHelpful
Thank you for providing the output on Linux, +1 :)
+1  A: 

You can test this in other ways:

#include <stdlib.h>
int main() {
    int *a=(int *)malloc(10*sizeof(int));
    int i;
    for (i=0;i<1000000; i++) a[i] = i;
    return 0;
}

On my machine, this causes SIGSEGV only at around i = 37000! (Tested by inspecting the core with gdb.)

To guard against these problems, test your programs using a malloc debugger... and use lots of mallocs, since there are no memory debugging libraries that I know of that can look into static memory. Example: Electric Fence

gcc -g -Wall docore.c -o c -lefence

And now the SIGSEGV is triggered as soon as i=10, as would be expected.

tucuxi
+1  A: 

As everyone says, your program has undefined behaviour. In fact your program has more bugs than you thought it did, but after it's already undefined it doesn't get any further undefined.

Here's my guess about why there was no output. You didn't completely disable optimization. The compiler saw that the code in function() doesn't have any defined effect on the rest of the program. The compiler optimized out the call to function().

Windows programmer
Is there an option for gcc to disable optimization?
OK, http://linux.die.net/man/1/gcc is too much to read in one day, but look for -O0 (that's a minus followed by capital O for optimization and digit 0 for none).
Windows programmer
+1  A: 

Odds are that the long string is, in fact, terminated by the zero byte in i. Assuming that the variables in main are laid out in the order they are declared -- which isn't required by anything in the language spec that I know of but seems likely in practice -- then large_string would be first in memory, followed by i. The loop sets i to 0 and counts up to 255. Whether i is stored big-endian or little-endian, either way it has a zero byte in it. So in traversing large_string, at either byte 256 or 257 you'll hit a null byte.
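
The claim that i always contains a zero byte can be checked in isolation, without touching the overflow. A minimal sketch, assuming 8-bit bytes and an int wider than one byte; the helper names are invented for this illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if any byte of the object
 * representation of value is zero, else 0. */
int has_zero_byte(int value) {
    /* The argument only works if int is wider than one byte. */
    assert(sizeof value > 1);
    const unsigned char *bytes = (const unsigned char *)&value;
    for (size_t k = 0; k < sizeof value; k++) {
        if (bytes[k] == 0)
            return 1;
    }
    return 0;
}

/* Checks every value i takes in the question's loop, including the
 * final 255 it holds after the loop ends. */
int all_loop_values_have_zero_byte(void) {
    for (int i = 0; i <= 255; i++) {
        if (!has_zero_byte(i))
            return 0;
    }
    return 1;
}
```

This only shows that i holds a zero byte regardless of endianness. Whether i actually sits directly after large_string in memory is up to the compiler.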

Beyond that, I'd have to study the generated code to figure out why this didn't blow up. As you seem to indicate, I'd expect that the copy to buffer would overwrite the return address from the strcpy, so when it tried to return you'd be going into deep space somewhere and would quickly blow up on something.

But as others say, "undefined" means "unpredictable".

Jay