+3  Q: 

ELF File Format

I'm attempting to manually load the hex dump of an ELF file, compiled with g++, into a processor simulation I designed. The file has 30 sections, and I am loading all 30 with their proper memory offsets taken into account. I then start my program counter at the beginning of the .text section (00400130), but the program doesn't run correctly. I have verified my processor design relatively thoroughly using SPIM as a gold standard. The strange thing is that if I load an assembly file into SPIM, take the disassembled .text and .data sections that it generates, and load them into my processor's memory, the programs work. This is different from what I want to do, which is to:

  • write a C++ program
  • compile it using mipseb-linux-g++ (cross compiler)
  • hex dump each section into its own file
  • read the files and load their contents into processor "memory"
  • run the program

Where in the ELF file should I place my program counter initially? I have it at the beginning of .text right now. Also, do I only need to include .text and .data for my program to work correctly? What am I doing wrong here?

+3  A: 

The ELF header should include the entry address, which is not necessarily the same as the first address in the .text region. Use objdump -f to see what the entry point of the file is -- it'll be called the "start address".

The format is described here - you should be using the program headers rather than the section headers for loading the ELF image into memory (I doubt that there are 30 program headers), and the entry point will be described by the e_entry field in the ELF header.
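As a rough illustration of loading from the program headers, here is a minimal Python sketch. It assumes a 32-bit big-endian executable (which is what a mipseb toolchain would be expected to produce) and models memory as a byte-addressed dict. The function name `load_elf32_be` is made up for this example; the field names follow the ELF specification.

```python
import struct

PT_LOAD = 1  # program header type for a loadable segment


def load_elf32_be(image: bytes):
    """Return (entry, memory) for a 32-bit big-endian ELF executable.

    `memory` is a dict mapping byte address -> byte value (an assumption
    made for this sketch, not a required representation).
    """
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if image[4] != 1 or image[5] != 2:
        raise ValueError("expected ELFCLASS32 and ELFDATA2MSB")

    # Elf32_Ehdr fields that follow the 16-byte e_ident array.
    (e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
     e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = struct.unpack_from(">HHIIIIIHHHHHH", image, 16)

    memory = {}
    for i in range(e_phnum):
        # Elf32_Phdr: eight 32-bit words.
        (p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz,
         p_flags, p_align) = struct.unpack_from(
            ">8I", image, e_phoff + i * e_phentsize)
        if p_type != PT_LOAD:
            continue
        # Bytes that are present in the file.
        for k in range(p_filesz):
            memory[p_vaddr + k] = image[p_offset + k]
        # The remainder (typically .bss) exists only in memory and is zero.
        for k in range(p_filesz, p_memsz):
            memory[p_vaddr + k] = 0
    return e_entry, memory
```

Called on the bytes of the whole file, for example `entry, memory = load_elf32_be(open("helloworld", "rb").read())`, it returns the address to put in the program counter together with the loaded bytes.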

Aidan Cully
Good to know. Turns out the starting address just happens to be 400130 for this file, so at least I know I'm starting at the correct location. To run a program, do I need to include any sections other than text and data? Perhaps rodata? I'm not sure.
Dan Snyder
If you use the program headers (which you should be doing, as you care about the "execution view" of the ELF file, rather than the "linker view"), the sections aren't named. But yes, there are other sections you'll care about. Take a look at the output from `objdump -h` - any sections that include ALLOC or LOAD probably need to be loaded into memory. I'm not completely sure that's the case, because I use the program headers, rather than the section headers, for loading ELF images.
Aidan Cully
`objdump -p` will tell you what the program headers are for the image.
Aidan Cully
You'll need `.rodata`; that's most likely where constant static objects end up. You'll need to allocate memory for the `.bss` (uninitialised data) sections; their contents are not included in the file. There are a few other sections besides `.text` that might be present and matter at startup and shutdown: `.ctors`, `.dtors`, `.init`, `.fini`.
Mike Seymour
What are the headers for? (I'm not particularly familiar with the ELF format.)
Dan Snyder
I suggest reading the document I linked to. Quoting: "A program header table, if present, tells the system how to create a process image. Files used to build a process image (execute a program) must have a program header table; relocatable files do not need one. A section header table contains information describing the file’s sections. Every section has an entry in the table; each entry gives information such as the section name, the section size, etc. Files used during linking must have a section header table; other object files may or may not have one." It also describes .bss, .data, etc.
Aidan Cully
+1  A: 

Use the e_entry field of the ELF header to determine where to set the Program Counter.

AJ
Apparently I was setting my PC to the correct location. Previously I was just loading my .text section first and automatically determining my PC start point by referencing the first address in the dump file. I can use this to get the correct value in a more assured manner.
Dan Snyder
+1  A: 

Look into Elf32_Ehdr.e_entry (or Elf64_Ehdr.e_entry if you are on a 64-bit platform). You should at least also include the .bss section, which takes up no space in the file on disk but has an in-memory size recorded in the ELF image.

Wikipedia will lead you to all necessary documentation.
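To make the Elf32/Elf64 distinction concrete, here is a small Python sketch that reads e_entry from the first bytes of a file. It relies on the e_ident bytes EI_CLASS (offset 4) and EI_DATA (offset 5) to pick the field width and byte order. The function name `elf_entry_point` is invented for this example.

```python
import struct


def elf_entry_point(header: bytes) -> int:
    """Read e_entry from the start of an ELF file (32- or 64-bit)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]

    # EI_DATA: 1 = little-endian (ELFDATA2LSB), 2 = big-endian (ELFDATA2MSB).
    if ei_data == 1:
        order = "<"
    elif ei_data == 2:
        order = ">"
    else:
        raise ValueError("unknown EI_DATA value")

    # EI_CLASS: 1 = ELFCLASS32 (4-byte e_entry), 2 = ELFCLASS64 (8-byte).
    if ei_class == 1:
        width = "I"
    elif ei_class == 2:
        width = "Q"
    else:
        raise ValueError("unknown EI_CLASS value")

    # e_entry sits at byte offset 24 in both Elf32_Ehdr and Elf64_Ehdr.
    return struct.unpack_from(order + width, header, 24)[0]
```

Reading the first 64 bytes of the file is enough for either class, since e_entry ends at offset 28 (32-bit) or 32 (64-bit).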

Edit:

Here's some output from objdump -h /usr/bin/vim on my current box:

Sections:
Idx Name         Size      VMA               LMA               File off  Algn
...
22 .bss          00009628  00000000006df760  00000000006df760  001df760  2**5
                 ALLOC
23 .comment      00000bc8  0000000000000000  0000000000000000  001df760  2**0
                 CONTENTS, READONLY

Note that the File off value is the same for both .bss and .comment, which means .bss takes up no space in the disk file but should occupy 0x9628 bytes in memory.
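One way to act on those flags in a section-based loader is sketched below in Python. It assumes memory is a byte-addressed dict and that the flags shown by objdump -h have already been collected into a set of strings. The helper name `place_section` and that representation are made up for this example.

```python
def place_section(memory, vma, size, flags, contents=b""):
    """Place one section into `memory`; return the number of bytes placed.

    flags    -- set of objdump-style flag names, e.g. {"ALLOC"}
    contents -- the section's bytes from the file (used only with LOAD)
    """
    if "ALLOC" not in flags:
        # Not part of the process image (e.g. .comment): skip it.
        return 0
    for k in range(size):
        if "LOAD" in flags:
            memory[vma + k] = contents[k]   # bytes come from the file
        else:
            memory[vma + k] = 0             # .bss-style: zero-filled
    return size


memory = {}
# The .bss entry from the listing above: ALLOC only, so it is zero-filled.
place_section(memory, 0x6DF760, 0x9628, {"ALLOC"})
# The .comment entry: no ALLOC flag, so nothing is placed.
place_section(memory, 0x0, 0xBC8, {"CONTENTS", "READONLY"})
```

With a memory model that already reads unassigned addresses as zero, the zero-fill step is redundant, but writing it out keeps the loader correct if that default ever changes.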

Nikolai N Fetissov
My .bss section isn't empty actually. What could this mean? Also, I'm using a c++ map to represent my memory so any location that isn't assigned a value will default to "0".
Dan Snyder
How do you know it's not empty?
Nikolai N Fetissov
When I run "readelf -x 15 helloworld", the section has elements at many locations. It's as dense as .text.
Dan Snyder
Oh, never mind, it is quite empty. I must have referenced the wrong section.
Dan Snyder
`objdump` numbers the first real section 0, while `readelf` reserves index 0 for the null section, so its indices are one higher. Look for the heading `Hex dump of section '.XXXX':` - is that `.bss`?
Nikolai N Fetissov
You need to allocate a section in memory if it's marked `ALLOC`, and load its contents from the file if it's marked `LOAD`.
Nikolai N Fetissov
Oh, I see. Technically, every address is already allocated in my memory model (assuming that an unassigned element's contents read as "0").
Dan Snyder