views:

597

answers:

3

I'm just finishing up a computer architecture course this semester where, among other things, we've been dabbling in MIPS assembly and running it in the MARS simulator. Today, out of curiosity, I started messing around with NASM on my Ubuntu box, and have basically just been piecing things together from tutorials and getting a feel for how NASM is different from MIPS. Here is the code snippet I'm currently looking at:

global _start

_start:

    mov eax, 4
    mov ebx, 1
    pop ecx
    pop ecx
    pop ecx
    mov edx, 200
    int 0x80
    mov eax, 1
    mov ebx, 0
    int 0x80

This is saved as test.asm, and assembled with nasm -f elf test.asm and linked with ld -o test test.o. When I invoke it with ./test anArgument, it prints 'anArgument', as expected, followed by however many characters it takes to pad that string to 200 characters total (because of that mov edx, 200 statement). The interesting thing, though, is that these padding characters, which I would have expected to be gibberish, are actually from the beginning of my environment variables, as displayed by the env command. Why is this printing out my environment variables?

+2  A: 

Without knowing the actual answer or having the time to look it up, I'm guessing that the environment variables get stored in memory after the command line arguments. Your code is simply buffer overflowing into the environment variable strings and printing them too.

This actually makes sense, since the command line arguments are handled by the system/loader, as are the environment variables, so it makes sense that they are stored near each other. To fix this, you would need to find the length of the command line arguments and only print that many characters. Or, since I assume they are null terminated strings, print until you reach a zero byte.

EDIT: I assume that both command line arguments and environment variables are stored in the initialized data section (.data in NASM, I believe)

Dan
Yep, that sounds reasonable considering there's a version of C's main function with the prototype 'int main(int argc, char *argv[], char *envp[]);'
Skizz
A: 

As long as you're being curious, you might want to work out how to print the address of your string (I think it's passed in and you popped it off the stack). Also, write a hex dump routine so you can look at that memory and other addresses you're curious about. This may help you discover things about the program space.

Curiosity may be the most important thing in your programmer's toolbox.

I haven't investigated the particulars of starting processes but I think that every time a new shell starts, a copy of the environment is made for it. You may be seeing the leftovers of a shell that was started by a command you ran, or a script you wrote, etc.

gbarry
AFAIK each process get a copy of the environment of its parent, so he is seeing the environment of the shell he ran the program from.
Dan
+1  A: 

In order to understand why you are getting environment variables, you need to understand how the kernel arranges memory on process startup. Here is a good explanation with a picture (scroll down to "Stack layout").

That website is extremely helpful and informative, thanks very much.
dancavallaro