ansaurus

Question

off-by-one error with string functions (C/C++) and security potentials

Answer 1

+5 A:

The only off-by-one error I see here is this line:

buffer[sizeof(buffer)] = '\0';

Is that what you're talking about? I'm not an expert on these things, so maybe I'm overlooking something, but since the only thing that will ever get written to that wrong byte is a zero, I think the possibilities are quite limited. The attacker can't control what's being written there. Most likely it would just cause a crash, but it could also cause all sorts of other odd behavior, all of it specific to your application. I don't see any code injection vulnerability here unless this error causes your app to expose another vulnerability that could then be used as the vector for the actual attack.

Again, take with a grain of salt...

rmeador 2009-05-05 21:45:28

Correct, this does not look like a security issue, since it doesn't involve data under an attacker's control.

Michael 2009-05-05 21:48:10

This is an illegal memory access. The OP needs to write to buffer[sizeof(buffer)-1] instead.
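
A minimal sketch of that fix, assuming the question's code copied caller-supplied input into a 64-byte stack buffer with strncpy. The question's code is not reproduced in this thread, so copy_input and its parameters are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (dst_size bytes) and always terminate.
   Valid indices are 0 .. dst_size - 1, so the terminator goes in the last slot. */
static void copy_input(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* strncpy writes exactly dst_size - 1 bytes (zero-padded if src is shorter) */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* in bounds, unlike dst[dst_size] */
}
```

Called as copy_input(buffer, sizeof(buffer), input) with char buffer[64], an input of 64 or more characters is truncated to 63 characters plus the terminator.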

BobbyShaftoe 2009-05-06 03:02:23

On Windows, overwriting an SEH structure and then generating an access violation could be interesting. On the other hand, if this code always crashes, there may be some hope of finding it during testing, given reasonable test coverage.

bk1e 2009-05-06 18:46:37

Answer 2

+2 A:

Read The Shellcoder's Handbook, 2nd Edition, for lots of information.

Jonathan Leffler 2009-05-05 21:47:14

Answer 3

+2 A:

Disclaimer: This is inferred knowledge from some research I just did, and should not be taken as gospel.

It's going to overwrite part or all of your saved frame pointer with a null byte - that's the reference point that your calling function will use to offset its memory accesses. So at that point the calling function's memory operations are going to a different location. I don't know what that location will be, but you don't want to be accessing the wrong memory. I won't say you can do anything, but you might be able to do something.

How do I know this (really, how did I infer this)? "Smashing The Stack For Fun And Profit" by Aleph One. It's quite old, and I don't know if Windows or compilers have changed the way the stack behaves to avoid these problems. But it's a starting point.

example1.c:
void function(int a, int b, int c) {
   char buffer1[5];
   char buffer2[10];
}

void main() {
  function(1,2,3);
}

To understand what the program does to call function() we compile it with gcc using the -S switch to generate assembly code output:
$ gcc -S -o example1.s example1.c

By looking at the assembly language output we see that the call to function() is translated to:
     pushl $3
     pushl $2
     pushl $1
     call function
This pushes the 3 arguments to function backwards into the stack, and calls function(). The instruction 'call' will push the instruction pointer (IP) onto the stack. We'll call the saved IP the return address (RET). The first thing done in function is the procedure prolog:
     pushl %ebp
     movl %esp,%ebp
     subl $20,%esp
This pushes EBP, the frame pointer, onto the stack. It then copies the current SP onto EBP, making it the new FP pointer. We'll call the saved FP pointer SFP. It then allocates space for the local variables by subtracting their size from SP.

We must remember that memory can only be addressed in multiples of the word size. A word in our case is 4 bytes, or 32 bits. So our 5 byte buffer is really going to take 8 bytes (2 words) of memory, and our 10 byte buffer is going to take 12 bytes (3 words) of memory. That is why SP is being subtracted by 20. With that in mind our stack looks like this when function() is called (each space represents a byte):
bottom of                                                            top of
memory                                                               memory
           buffer2       buffer1   sfp   ret   a     b     c
<------   [            ][        ][    ][    ][    ][    ][    ]

top of                                                            bottom of
stack                                                                 stack
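
To relate the diagram to the question's 64-byte buffer, here is a rough simulation, assuming the saved frame pointer sits directly after the buffer as in the excerpt. It does not touch the real stack: the frame is modeled as a plain byte array, and clobber_saved_fp, BUF_SIZE and the example address are made up for illustration. Real compilers may lay the frame out differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 64

/* Simulation only: models "buffer followed directly by the saved frame
   pointer" as one byte array. A real frame layout is up to the compiler. */
static uint32_t clobber_saved_fp(uint32_t saved_fp)
{
    unsigned char frame[BUF_SIZE + sizeof saved_fp];
    assert(sizeof frame == BUF_SIZE + 4);   /* sanity check on the model */

    memset(frame, 'A', BUF_SIZE);                          /* a full buffer */
    memcpy(frame + BUF_SIZE, &saved_fp, sizeof saved_fp);  /* SFP right after it */

    frame[BUF_SIZE] = '\0';   /* what buffer[sizeof(buffer)] = '\0' would hit */

    memcpy(&saved_fp, frame + BUF_SIZE, sizeof saved_fp);
    return saved_fp;
}
```

On a little-endian machine such as x86, clobber_saved_fp(0xbffff4a8) returns 0xbffff400: only the lowest byte changes, so the saved pointer moves down by at most 255 bytes, and the attacker does not get to choose the new value.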

Tom Ritter 2009-05-05 21:53:41

As far as software is concerned (on x86, at least), memory is addressable at the byte level. In the actual hardware, memory is addressed in blocks that are far larger than the word size. Perhaps it is padding the arrays as an optimization.

rmeador 2009-05-05 22:07:53

x86 is byte addressable, but fetching 32-bit quantities from non-32-bit aligned addresses is more costly. Padding the buffers to multiples of 4 bytes makes it easier to keep future locals, parameters, and pointers naturally aligned on the stack.
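
The padding arithmetic can be sketched as follows. round_up_to_word is a hypothetical helper, and the 4-byte word is the one assumed in the quoted article; current compilers often align the stack more coarsely (16 bytes is common), so the numbers on a modern build may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of word (word must be a power of two). */
static size_t round_up_to_word(size_t size, size_t word)
{
    assert(word != 0 && (word & (word - 1)) == 0);
    return (size + word - 1) & ~(word - 1);
}
```

With a 4-byte word, round_up_to_word(5, 4) is 8 and round_up_to_word(10, 4) is 12, which add up to the 20 bytes subtracted from SP in the prolog.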

Michael 2009-05-05 22:39:33

Answer 4

+1 A:

What can a malicious attacker do if she figures out how the function foo() works? Basically, what kinds of potential security problems is this code vulnerable to?

This is probably not the best example of a bug that could be easily exploited for security purposes, although it could be exploited to potentially crash the code simply by using a string of 64 characters or longer.

While it certainly is a bug that will corrupt the byte immediately after the array (on the stack) with a single zero byte, there is no easy way for a hacker to inject data into the corrupted area. Calling the printf() function will push parameters on the stack and may overwrite the zero that was written out of the array's bounds, leading to a potentially unterminated string being passed to printf.

However, without intimate knowledge of what goes on in printf (and needing to exploit printf as well as foo), a hacker would be hard pressed to do anything other than crash your code.

FWIW, this is a good reason to compile with warnings on or to use functions like strncpy_s, which both respect the buffer size and include a terminating null even if the copied string is larger than the buffer. With strncpy_s, the line "buffer[sizeof(buffer)] = '\0';" is not even necessary.
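
strncpy_s belongs to the optional Annex K of C11 (and to Microsoft's CRT), so it is not available with every toolchain. As a portable stand-in, not the function named above, here is a sketch using snprintf, which also never writes past the given size and always terminates when the size is nonzero. copy_bounded is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst with truncation.
   Returns 1 if src fit entirely, 0 if it was truncated or an error occurred. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf writes at most dst_size bytes, including the terminator */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With char buffer[64], copy_bounded(buffer, sizeof(buffer), input) leaves at most 63 characters plus the terminator in buffer, so no separate termination line is needed, and the return value tells the caller whether the input was cut short.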

Adisak 2009-05-05 22:51:42

Answer 5

A:

The issue is that you don't have permission to write to the item after the array. When you asked for 64 chars for buffer, the system is required to give you at least 64 bytes. It's normal for the system to give you more than that -- in which case the memory belongs to you and there is no problem in practice -- but it is possible that even the first byte after the array belongs to "somebody else."

So what happens if you overwrite it? If the "somebody else" is actually inside your program (maybe in a different structure or thread) the operating system probably won't notice you trampled on that data, but that other structure or thread might. There's no telling what data should be there or how trampling over it will affect things.

In this case you allocated buffer on the stack, which means (1) the somebody else is you, and in fact is your current stack frame, and (2) it's not in another thread (but could affect other local variables in the current stack frame).

Max Lybbert 2009-05-06 01:56:05
