We are developing a port of the GNU Assembler for a client architecture, and have run into the following problem:

If an immediate operand to an instruction is an expression involving more than one relocatable symbol, how is it handled in an ELF output file? What relocation information is produced in such a case?

For example:

j label1 + label2

where label1 and label2 are defined in relocatable sections, which might be the same section or two different ones.

A: 

I know jack about ELF and only a little more about linking but...

I would expect that each operand is handled the same way that it would be if there was only one.

OTOH, might the issue be that the encoding of j changes depending on where the labels are? If so, I think you're sunk, as linkers aren't smart enough to do that sort of thing (the Ada build system IIRC might be smarter than most, so you might look at it).

BCS
+5  A: 

ELF doesn't know about instructions, per se. It knows about particular encodings of symbol offsets within instructions. In the assembler, you would need to output two relocation records, each with the corresponding [address,type,symbol] triplet to properly patch that portion of the instruction. The linker wouldn't necessarily even know that these two records point to the same instruction.
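To make the two-record idea concrete, here is a rough Python sketch of what a linker might do with two relocation records that target the same field. Everything in it is an assumption for illustration: the relocation type `R_HYP_ABS32` is hypothetical, the field is taken to be a plain 32-bit little-endian immediate, and the relocation is modelled REL-style (the value already in the field is used as the addend), which is what makes the two symbol values add up. A real port would define its own types to match the actual instruction encoding.

```python
import struct
from dataclasses import dataclass

# Hypothetical relocation type: add the symbol's address to the 32-bit field.
R_HYP_ABS32 = 1


@dataclass
class Reloc:
    offset: int  # byte offset of the field to patch within the section
    rtype: int   # relocation type (ISA-specific; hypothetical here)
    symbol: str  # symbol whose final address is needed


def apply_relocs(section, relocs, symtab):
    """Apply REL-style relocations: the field's current content is the addend."""
    for r in relocs:
        if r.rtype != R_HYP_ABS32:
            raise ValueError(f"unknown relocation type {r.rtype}")
        (addend,) = struct.unpack_from("<I", section, r.offset)
        value = (symtab[r.symbol] + addend) & 0xFFFFFFFF
        struct.pack_into("<I", section, r.offset, value)
    return section


def demo():
    # A 4-byte immediate field at offset 0, initially zero (no constant term).
    section = bytearray(4)
    # Two independent records that happen to point at the same field.
    relocs = [
        Reloc(0, R_HYP_ABS32, "label1"),
        Reloc(0, R_HYP_ABS32, "label2"),
    ]
    # Made-up final addresses assigned by the linker.
    symtab = {"label1": 0x1000, "label2": 0x2040}
    apply_relocs(section, relocs, symtab)
    return struct.unpack_from("<I", section, 0)[0]
```

Calling `demo()` patches the field once per record, so it ends up holding the sum of the two made-up addresses. Note that the loop never needs to know both records belong to one instruction.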

The ELF relocation types are completely CPU-dependent (or, to be more precise, ISA-dependent), so you are free to define whatever relocations you need for a new architecture.

It's hard to be more specific without details of the instruction encoding.

A: 

I would expect one entry per address for every instruction that needs relocation.

objdump can display the relocation entries of an object file with the -r flag (readelf -r shows the same information).
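If you would rather poke at the raw records yourself, the following is a minimal sketch of decoding 32-bit REL entries in Python. It assumes a little-endian ELF32 file and that you have already extracted the bytes of a relocation section; the sample data (offset 0x10, symbol indices 5 and 6, type 1) is made up. The layout itself (an 8-byte record of r_offset and r_info, with the symbol index in the upper 24 bits of r_info and the type in the low 8 bits) is the standard Elf32_Rel format.

```python
import struct


def decode_rel32(data):
    """Decode packed little-endian Elf32_Rel entries.

    Returns a list of (r_offset, symbol_index, reloc_type) tuples.
    Trailing bytes that do not form a whole 8-byte record are ignored.
    """
    entries = []
    usable = len(data) - len(data) % 8
    for pos in range(0, usable, 8):
        r_offset, r_info = struct.unpack_from("<II", data, pos)
        entries.append((r_offset, r_info >> 8, r_info & 0xFF))
    return entries


def sample_rel_section():
    # Made-up data: two records patching the same offset (0x10),
    # one against symbol index 5 and one against symbol index 6.
    return (struct.pack("<II", 0x10, (5 << 8) | 1)
            + struct.pack("<II", 0x10, (6 << 8) | 1))
```

Running `decode_rel32(sample_rel_section())` gives two entries with the same offset but different symbol indices, which is the shape you would expect to see for an operand built from two symbols.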

My suggestion is to try to dig up an x86 (or other CISC) instruction that does something similar to what your client arch does, and see what relocations are generated when you assemble/link it.

HUAGHAGUAH