I'm considering picking up some very rudimentary understanding of assembly. My current goal is simple: VERY BASIC understanding of GCC assembler output when compiling C/C++ with the -S switch for x86/x86-64.

Just enough to do simple things such as looking at a single function and verifying whether GCC optimizes away things I expect to disappear.

Does anyone have/know of a truly concise introduction to assembly, relevant to GCC and specifically for the purpose of reading, and a list of the most important instructions anyone casually reading assembly should know?

A: 

I'm sure there are introductory books and web sites out there, but a pretty efficient way of learning it is actually to get the Intel references, write some simple stuff (like integer math and Boolean logic) in your favorite high-level language, and then look at what the resulting binary code is.
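A minimal sketch of that kind of experiment, assuming a source file called `simple.c` (the file name and both function names are made up for illustration). Compiling with the `-S` switch from the question writes the assembly to `simple.s` without assembling or linking:

```c
/* simple.c (hypothetical file name)
 *
 * Generate assembly only:
 *   gcc -O2 -S simple.c        (writes simple.s)
 */

/* Integer math: on x86 this will likely show up as a shift or multiply
 * plus an add, or as a single lea that does both at once. */
int scale_and_offset(int x)
{
    return x * 4 + 3;
}

/* Boolean logic: look for and/or/not-style instructions. */
unsigned mask_bits(unsigned a, unsigned b)
{
    return (a & b) | (~a & 0xFFu);
}
```

Small functions with no `main` keep the output short, so each instruction can be looked up in the reference as it appears.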

Michael Madsen
If you are compiling for x86 you can use the compiler flag -masm=intel to get gcc to output assembly that looks more like Intel's manuals.
nategoose
+1  A: 

"casually reading assembly" lol (nicely)

I would start by following along in gdb at run time; you get a better feel for what's happening. But then maybe that's just me. It will disassemble a function for you (disass func), and then you can single-step through it.

If you are doing this solely to check the optimizations - do not worry.

a) the compiler does a good job

b) you won't be able to understand what it is doing anyway (nobody can)

pm100
Sometimes I find optimized code easier to read because it notices where it's being redundant and changes it to something like I would write.
avpx
For myself, I know that it is a good idea to do this solely for checking optimizations. The reason is that every time I see the compiler actually doing something smart about *situation X*, I will not spend any time in the future *wondering*. avpx also has a very good point.
+1, that's a great idea, I've added `disass func` to a CW on gdb: http://stackoverflow.com/questions/2588853/the-community-driven-gdb-primer/2611474#2611474. By all means feel free to edit what I've put there.
Ninefingers
A: 

Unlike higher-level languages, there's really not much (if any) difference between being able to read assembly and being able to write it. Instructions have a one-to-one relationship with CPU opcodes -- there's no complexity to skip over while still retaining an understanding of what the line of code does. (It's not like a higher-level language where you can see a line that says "print $var" and not need to know or care about how it goes about outputting it to screen.)

If you still want to learn assembly, try the book Assembly Language Step-by-Step: Programming with Linux, by Jeff Duntemann.

Dan Story
I don't agree (but wouldn't downvote for that reason); it's much easier to understand something that's before you and known to be well-formed than to create that well-formed code yourself. Being able to read assembly can certainly help /edit/ assembly, but being able to read it is a far cry from being able to author even trivial functionality from scratch. I may be able to sort of understand when people talk to me in the foreign languages that I've studied, but I sure can't speak any of them in well-formed ways!
dash-tom-bang
+2  A: 

I usually hunt down the processor documentation when faced with a new device, and then just look up the opcodes as I encounter ones I don't know.

On Intel, thankfully the opcodes are somewhat sensible. PowerPC not so much in my opinion. MIPS was my favorite. For MIPS I borrowed my neighbor's little reference book, and for PPC I had some IBM documentation in a PDF that was handy to search through. (And for Intel, mostly I guess and then watch the registers to make sure I'm guessing right! heh)

Basically, the assembly itself is easy. It does three things: move data between memory and registers, operate on data in registers, and change the program counter. Mapping between your language of choice and the assembly will require some study (e.g. learning how to recognize a virtual function call), and for this an "integrated" source and disassembly view (like you can get in Visual Studio) is very useful.
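To make those three categories concrete, here is a small sketch (the function name `sum_array` is just an illustration). Compiled with something like `gcc -O1 -S`, its loop should contain one of each: a load from memory, an add on registers, and a compare followed by a conditional jump. At higher optimization levels the compiler may vectorize or unroll the loop, which makes the mapping harder to see.

```c
#include <stddef.h>

/* Sums n ints from an array. */
int sum_array(const int *values, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Load values[i] from memory, then add it to a register. */
        total += values[i];
    }   /* Loop test: a compare plus a conditional jump, which is the
         * "change the program counter" part. */
    return total;
}
```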

dash-tom-bang
+5  A: 

First of all, AT&T-style assembly has always bothered me. That's why I compile with the -masm=intel argument when I want to look at generated assembly code. The --save-temps option saves temporary files (preprocessed source, assembly output, unlinked object file) in the directory GCC is called from.
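As a rough sketch of how these flags can be used to check that something gets optimized away (the file name `dead.c` and the function names are hypothetical):

```c
/* dead.c (hypothetical file name)
 *
 * Intel-syntax assembly only:
 *   gcc -O2 -S -masm=intel dead.c
 * Compile to an object file but keep the intermediate files:
 *   gcc -O2 -c --save-temps dead.c
 */

static int square(int x)
{
    return x * x;
}

int always_nine(void)
{
    int unused = square(1000);  /* result is never used */
    (void)unused;               /* silence the unused-variable warning */
    return square(3);           /* constant argument, static callee */
}
```

With optimization on, one would expect `always_nine` to shrink to little more than loading the constant 9 into `eax` and returning, with no call to `square` left. Compiling the same file with `-O0` gives a useful before/after comparison.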

Getting a superficial understanding of x86 assembly should be easy with all the resources out there. Here's one such resource: http://www.cs.virginia.edu/~evans/cs216/guides/x86.html .

You can also just use a disassembler (objdump -d, for example) and gdb to see what a compiled program is doing.

Yktula
That article is a short and nice read, thanks.
+8  A: 

You should use GCC's -fverbose-asm option. It makes the compiler output additional information (in the form of comments) that makes it easier to understand the assembly code's relationship to the original C/C++ code.
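A small sketch for trying it out, assuming a file called `area.c` (the file and function names are invented for the example). Named parameters and locals are useful here because the generated comments tend to refer to them by name; the exact comment format depends on the GCC version.

```c
/* area.c (hypothetical file name)
 *
 * Annotated assembly:
 *   gcc -O1 -S -fverbose-asm area.c
 * Then look in area.s for comments that mention width, height and area.
 */
int rect_area(int width, int height)
{
    int area = width * height;
    return area;
}
```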

Wyzard
Good to know, thanks.