ansaurus

Question

Building an assembler

Answer 1

+2 A:

Look at this Assembler Development Kit from Randy Hyde, author of the famous "The Art of Assembly Language":

The Assembler Developer's Kit

Hernán 2008-12-21 20:10:43

Answer 2

+1 A:

The first pass of a two-pass assembler assembles the code and leaves placeholders for the symbols, because you don't know how big everything is (and so where each symbol ends up) until the whole source has been run through the assembler. The second pass fills in the addresses. If the assembled code subsequently needs to be linked to external references, this is the job of the eponymous linker.

ConcernedOfTunbridgeWells 2008-12-21 20:10:50

Answer 3

+4 A:

I've written three or four simple assemblers. I didn't use a parser generator; instead, I modeled them on the S-C assembler for the 6502, which was the one I knew best.

To do this, I used a simple syntax - a line was one of the following:

nothing
[label] [instruction] [comment]
[label] [directive] [comment]

A label was one letter followed by any number of letters or digits.

An instruction was <whitespace><mnemonic> [operands]

A directive was <whitespace>.XX [operands]

A comment was a * up to end of line.

Operands depended on the instruction and the directive.
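As a rough illustration of the line syntax above (not the original code), here is a minimal Python sketch. The names `LINE_RE` and `parse_line` are made up for this example, and it assumes a label always starts in column 1. Comment stripping is naive: it cuts at the first `*`, so it would mangle an operand that contains one.

```python
import re

# Hypothetical pattern for: [label] <whitespace><mnemonic or .XX> [operands]
LINE_RE = re.compile(
    r"^(?P<label>[A-Za-z][A-Za-z0-9]*)?"   # optional label in column 1
    r"(?:\s+(?P<op>\.?[A-Za-z]+)"          # whitespace, then mnemonic or directive
    r"(?:\s+(?P<operands>\S.*?))?)?"       # optional operands
    r"\s*$"
)

def parse_line(line):
    """Split one source line into label / op / operands, or return None if empty."""
    # A comment runs from '*' to the end of the line (naive: first '*' wins).
    code = line.split("*", 1)[0].rstrip()
    if not code:
        return None
    m = LINE_RE.match(code)
    if m is None:
        raise SyntaxError(f"cannot parse line: {line!r}")
    op = m.group("op")
    if op is None:
        kind = None                      # label on a line by itself
    elif op.startswith("."):
        kind = "directive"
    else:
        kind = "instruction"
    return {
        "label": m.group("label"),
        "kind": kind,
        "op": op.upper() if op else None,
        "operands": m.group("operands"),
    }
```

For example, `parse_line("START  LDA #1  * load one")` yields the label `START`, the instruction `LDA` and the operand string `#1`, while a line holding only a comment yields `None`.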

Directives included:

.EQ equate for defining constants

.OR set origin address of code

.HS hex string of bytes

.AS ascii string of bytes - any delimiter except white space - whatever started it ended it

.TF target file for output

.BS n reserve block storage of n bytes
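To make the byte-producing directives concrete, here is a small Python sketch of how .HS, .AS and .BS could turn their operands into output bytes. The function name `directive_bytes` is hypothetical, and filling reserved storage with zeros is an assumption; .EQ, .OR and .TF change assembler state instead of emitting bytes, so they are left out.

```python
def directive_bytes(name, operand):
    """Return the bytes emitted by a .HS, .AS or .BS directive (illustrative only)."""
    name = name.upper()
    if not operand:
        raise ValueError(f"{name} needs an operand")
    if name == ".HS":
        # Hex string: pairs of hex digits, e.g. "A9008D"
        return bytes.fromhex(operand)
    if name == ".AS":
        # Whatever character starts the string also ends it.
        delimiter = operand[0]
        end = operand.find(delimiter, 1)
        if end == -1:
            raise ValueError("unterminated string")
        return operand[1:end].encode("ascii")
    if name == ".BS":
        # Reserve n bytes; zero fill is an assumption of this sketch.
        return bytes(int(operand))
    raise ValueError(f"directive does not emit bytes: {name}")
```

With this, `directive_bytes(".AS", "/HI THERE/")` gives the eight ASCII bytes of `HI THERE`, and `directive_bytes(".HS", "A9008D")` gives the three bytes A9 00 8D.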

When I wrote it, I wrote simple parsers for each component. Whenever I encountered a label definition, I put it in a symbol table with its target address. Whenever an instruction referred to a label I didn't know yet, I marked the instruction as incomplete and stored the unknown label together with a reference to the instruction that needed fixing.

After all source lines had been processed, I went through the "to fix" table and looked up each entry in the symbol table. If I found it, I patched the instruction. If not, it was an error.
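The symbol table and "to fix" table can be sketched in Python roughly as follows. This is not the original code: the function `assemble`, the pre-parsed `(label, mnemonic, operand)` tuples and the tiny two-instruction set (6502 NOP and JMP absolute) are assumptions chosen to keep the example short.

```python
def assemble(lines, origin=0x0800):
    """Assemble (label, mnemonic, operand) tuples, patching forward references."""
    code = bytearray()
    symbols = {}   # label -> address
    to_fix = []    # (offset of the 2-byte placeholder in code, label name)

    for label, mnemonic, operand in lines:
        if label:
            if label in symbols:
                raise ValueError(f"duplicate label: {label}")
            symbols[label] = origin + len(code)
        if mnemonic is None:
            continue                      # line held only a label
        if mnemonic == "NOP":
            code.append(0xEA)
        elif mnemonic == "JMP":
            code.append(0x4C)
            if operand in symbols:
                # Backward reference: the address is already known.
                code += symbols[operand].to_bytes(2, "little")
            else:
                # Forward reference: leave a placeholder and remember it.
                to_fix.append((len(code), operand))
                code += b"\x00\x00"
        else:
            raise ValueError(f"unknown mnemonic: {mnemonic}")

    # All lines seen: patch every incomplete instruction or report an error.
    for offset, name in to_fix:
        if name not in symbols:
            raise ValueError(f"undefined label: {name}")
        code[offset:offset + 2] = symbols[name].to_bytes(2, "little")
    return bytes(code)


program = [
    ("START", "NOP", None),
    (None, "JMP", "END"),     # forward reference, patched at the end
    (None, "JMP", "START"),   # backward reference, resolved immediately
    ("END", "NOP", None),
]
image = assemble(program)
```

Only the forward jump to `END` goes through the "to fix" table; the jump back to `START` is resolved on the spot because the label is already in the symbol table.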

I kept a table of instruction names and all the valid addressing modes for operands. When I got an instruction, I tried to parse each addressing mode in turn until something worked.
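A minimal Python sketch of that idea, under some assumptions: the opcode values are the standard 6502 ones, but the table layout, the names `MODES`, `OPCODES` and `encode`, and the hex-only operand syntax are invented for this example, and only a handful of instructions and modes are covered.

```python
import re

# (mode name, operand pattern, operand size in bytes), tried in this order.
MODES = [
    ("implied",   re.compile(r"^$"), 0),
    ("immediate", re.compile(r"^#\$([0-9A-Fa-f]{1,2})$"), 1),
    ("zeropage",  re.compile(r"^\$([0-9A-Fa-f]{1,2})$"), 1),
    ("absolute",  re.compile(r"^\$([0-9A-Fa-f]{1,4})$"), 2),
]

# Instruction name -> the addressing modes it accepts and their opcodes.
OPCODES = {
    "NOP": {"implied": 0xEA},
    "LDA": {"immediate": 0xA9, "zeropage": 0xA5, "absolute": 0xAD},
    "STA": {"zeropage": 0x85, "absolute": 0x8D},
    "JMP": {"absolute": 0x4C},
}

def encode(mnemonic, operand=""):
    """Try each addressing mode valid for the mnemonic until one parses."""
    valid = OPCODES.get(mnemonic.upper())
    if valid is None:
        raise ValueError(f"unknown instruction: {mnemonic}")
    text = operand.strip()
    for mode, pattern, size in MODES:
        if mode not in valid:
            continue                      # instruction has no such mode
        m = pattern.match(text)
        if m:
            operand_bytes = int(m.group(1), 16).to_bytes(size, "little") if size else b""
            return bytes([valid[mode]]) + operand_bytes
    raise ValueError(f"no addressing mode of {mnemonic} matches {operand!r}")
```

Because the modes are tried in order, `LDA $10` is encoded in zero page form, while `JMP $10` falls through to absolute, the only mode listed for JMP here.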

Given this structure, it should take a day, maybe two, to do the whole thing.

plinth 2008-12-21 20:15:59

Thank you for your answer. I have the following problem:

LOOP1:
LOOP2:
LOOP3: ADD r1,r2
JMP LOOP1

The way I wrote the assembler, it jumps to the line that contains LOOP2, but it should jump to the ADD instruction. I am parsing line by line. Did you treat the whole code as a single line?

John 2008-12-22 09:16:32

You need to store the address of the next instruction to be executed as the value for the label. This means you keep track of all the labels you have seen but not yet resolved (in your case LOOP1, LOOP2 and LOOP3), and when you get to the next actual instruction (ADD), you know the value of those labels, so you go back and fill them in.
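A small Python sketch of this, assuming the `NAME:` label syntax from the question and, purely for illustration, that every instruction occupies one address unit. The function name `collect_labels` is made up, and the label splitting is naive (it would break on an operand containing a colon).

```python
def collect_labels(lines, instruction_size=1):
    """Map each label to the address of the next actual instruction."""
    symbols = {}
    pending = []    # labels seen since the last instruction
    address = 0
    for line in lines:
        text = line.strip()
        # Peel off any labels at the start of the line.
        while ":" in text:
            name, text = text.split(":", 1)
            pending.append(name.strip())
            text = text.strip()
        if not text:
            continue                  # label-only or empty line: no code emitted
        # An actual instruction: every pending label points here.
        for name in pending:
            symbols[name] = address
        pending.clear()
        address += instruction_size
    # Labels at the very end refer to the address just past the code.
    for name in pending:
        symbols[name] = address
    return symbols


source = ["NOP", "LOOP1:", "LOOP2:", "LOOP3: ADD r1,r2", "JMP LOOP1"]
labels = collect_labels(source)
```

Here all three labels end up with the address of the ADD instruction, because label-only lines do not advance the address.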

Bearddo 2008-12-22 15:08:53
