A file that is given as input to the linker is called an Object File. The linker produces an Image File, which in turn is used as input by the loader.
A blurb from the "Microsoft Portable Executable and Common Object File Format Specification":
RVA (relative virtual address). In an image file, the address of an item after it is loaded into memory, with the base address of the image file subtracted from it. The RVA of an item almost always differs from its position within the file on disk (file pointer).
In an object file, an RVA is less meaningful because memory locations are not assigned. In this case, an RVA would be an address within a section (described later in this table), to which a relocation is later applied during linking. For simplicity, a compiler should just set the first RVA in each section to zero.
VA (virtual address). Same as RVA, except that the base address of the image file is not subtracted. The address is called a “VA” because Windows creates a distinct VA space for each process, independent of physical memory. For almost all purposes, a VA should be considered just an address. A VA is not as predictable as an RVA because the loader might not load the image at its preferred location.
Even after reading this, I still don't get it, and I have a lot of questions. Can anyone explain it in a practical way? Please stick to the terminology of Object File & Image File as stated.
All I know about addresses is that:
- Neither in the Object File nor in the Image File do we know the exact memory locations, so
- The assembler, while generating the Object File, computes addresses relative to the sections .data & .text (for function names).
- The linker takes multiple object files as input and generates one Image File. While generating it, the linker first merges the matching sections of each object file and, while merging, recomputes the address offsets, again relative to each section. There is nothing like global offsets.
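To make my understanding of the merge step concrete, here is a toy sketch of it. Everything in it is made up for illustration: the file names `a.obj` and `b.obj`, the section sizes, the symbol names and the helper `merge_sections` are hypothetical, and it ignores alignment and relocation records, so it is only a model of what I think happens, not how a real linker is implemented.

```python
# Toy model: each object file contributes one chunk per section, and every
# symbol is recorded as (section, offset within that object's own chunk).
# All names and sizes below are invented for illustration.
object_files = {
    "a.obj": {"sizes": {".text": 0x40, ".data": 0x10},
              "symbols": {"main": (".text", 0x00), "counter": (".data", 0x04)}},
    "b.obj": {"sizes": {".text": 0x30, ".data": 0x08},
              "symbols": {"helper": (".text", 0x10), "table": (".data", 0x00)}},
}

def merge_sections(objs):
    """Concatenate same-named sections and rebase each symbol's offset."""
    merged_size = {}   # section name -> running size of the merged section
    merged_syms = {}   # symbol name -> (section, offset in the merged section)
    for obj in objs.values():
        # Where this object's chunk starts inside each merged section.
        chunk_start = {sec: merged_size.get(sec, 0) for sec in obj["sizes"]}
        for sym, (sec, off) in obj["symbols"].items():
            merged_syms[sym] = (sec, chunk_start[sec] + off)
        for sec, size in obj["sizes"].items():
            merged_size[sec] = chunk_start[sec] + size
    return merged_size, merged_syms

sizes, symbols = merge_sections(object_files)
for sym, (sec, off) in symbols.items():
    print(f"{sym}: {sec} + {off:#x}")
```

In this model `helper` sat at offset 0x10 of `b.obj`'s own .text, and after the merge it sits at 0x50 of the combined .text, because `a.obj`'s 0x40 bytes come first. The offset is still relative to a section, which is the point I am trying to confirm.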
If there is something wrong in what I know, please correct me.
EDIT:
After reading the answer given by Francis, I'm clear about what Physical Address, VA & RVA are and what the relation between them is.
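As I now understand it from the specification text quoted above, the relation between VA and RVA is plain addition of the image base. A minimal sketch, where the helper names `rva_to_va` and `va_to_rva` and both base addresses are invented for illustration:

```python
def rva_to_va(image_base, rva):
    """VA = base address at which the image was loaded + RVA."""
    return image_base + rva

def va_to_rva(image_base, va):
    """RVA = VA with the image's base address subtracted."""
    return va - image_base

# Made-up numbers: the same item in the same image, loaded at two bases.
preferred_base = 0x10000000   # hypothetical preferred load address
actual_base = 0x20000000      # hypothetical address the loader really used
rva = 0x1234

print(hex(rva_to_va(preferred_base, rva)))
print(hex(rva_to_va(actual_base, rva)))
```

The RVA stays the same in both cases while the VA moves with the base, which matches the quoted remark that a VA is less predictable than an RVA.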
The RVAs of all variables & methods must be computed by the linker during relocation. So (the value of the RVA of a method/variable) == (its offset from the beginning of the file) must be true. But surprisingly, it's not. Why so?
I checked this by using PEView on "c:\WINDOWS\system32\kernel32.dll" and found that:
- RVA & FileOffset are the same up to the beginning of the sections (.text is the first section in this DLL).
- From the beginning of .text through .data and .rsrc up to the last byte of the last section (.reloc), RVA & FileOffset are different. Also, the RVA of the first byte of the first section is "always" shown as 0x1000.
- Interestingly, the bytes of each section are contiguous in FileOffset: the next section begins at the byte right after a section's last byte. But in RVA terms, there is a huge gap between the RVA of the last byte of a section and the RVA of the first byte of the next section.
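For anyone who wants to check these observations without PEView, here is a minimal sketch that reads the section table directly. The function name `read_section_table` is my own. The field offsets follow the header layout in the PE/COFF specification (`e_lfanew` at 0x3C, a 20-byte COFF header, 40-byte section headers), and it assumes a well-formed image with no further validation.

```python
import struct

def read_section_table(data):
    """Return (name, VirtualAddress, PointerToRawData, SizeOfRawData) per section."""
    # e_lfanew at offset 0x3C of the DOS header points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if bytes(data[pe_offset:pe_offset + 4]) != b"PE\0\0":
        raise ValueError("not a PE image")
    # The 20-byte COFF file header follows the signature.
    coff = pe_offset + 4
    (number_of_sections,) = struct.unpack_from("<H", data, coff + 2)
    (size_of_optional_header,) = struct.unpack_from("<H", data, coff + 16)
    # Section headers (40 bytes each) start right after the optional header.
    table = coff + 20 + size_of_optional_header
    sections = []
    for i in range(number_of_sections):
        # Name, VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData
        name, _vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + 40 * i)
        text = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append((text, vaddr, raw_ptr, raw_size))
    return sections

def print_sections(path):
    """Print each section's RVA next to its file offset."""
    with open(path, "rb") as f:
        data = f.read()
    for name, vaddr, raw_ptr, raw_size in read_section_table(data):
        print(f"{name:8} RVA={vaddr:#010x} FileOffset={raw_ptr:#010x} "
              f"RawSize={raw_size:#x}")
```

Calling `print_sections` on the DLL path mentioned above should list, for each section, the same two columns I compared in PEView: `VirtualAddress` is the RVA of the section's first byte and `PointerToRawData` is its file offset.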
My Guess:
All the bytes of data before the first section (.text here) are "not" actually loaded into the VA space of the process; these bytes are just used to locate & describe the sections. They could be called "meta section data".
Since they are not loaded into the VA space of the process, the term RVA is meaningless for them, and this is the reason why RVA == FileOffset for these bytes.
Since:
- The term RVA is valid only for those bytes which will actually be loaded into the VA space.
- The bytes of .text, .data, .rsrc and .reloc are such bytes.
- Instead of starting from RVA 0x0000, PEView starts them from 0x1000.
I cannot understand or explain the 3rd observation.