I've always been curious about:

  1. What exactly does a process look like in memory?
  2. What are the different segments (parts) in it?
  3. How exactly are the program (on disk) and the process (in memory) related?

My previous question: http://stackoverflow.com/questions/1966920/more-info-on-memory-layout-of-an-executable-program-process

In my quest, I finally found an answer. I found this excellent article that cleared up most of my questions: http://www.linuxforums.org/articles/understanding-elf-using-readelf-and-objdump_125.html

In the above article, the author shows how to get the different segments of a process (on Linux) and compares them with the corresponding ELF file. I'm quoting this section here:

Curious to see the real layout of the process segments? We can use the /proc/<pid>/maps file to reveal it, where <pid> is the PID of the process we want to observe. Before we move on, we have a small problem here. Our test program runs so fast that it ends before we can even dump the related /proc entry. I use gdb to solve this. You can use another trick, such as inserting sleep() before it returns.

In a console (or a terminal emulator such as xterm) do:

$ gdb test
(gdb) b main
Breakpoint 1 at 0x8048376
(gdb) r
Breakpoint 1, 0x08048376 in main ()

Hold right here, open another console and find out the PID of program "test". If you want the quick way, type:

$ cat /proc/`pgrep test`/maps

You will see an output like below (you might get different output):

[1]  0039d000-003b2000 r-xp 00000000 16:41 1080084  /lib/ld-2.3.3.so
[2]  003b2000-003b3000 r--p 00014000 16:41 1080084  /lib/ld-2.3.3.so
[3]  003b3000-003b4000 rw-p 00015000 16:41 1080084  /lib/ld-2.3.3.so
[4]  003b6000-004cb000 r-xp 00000000 16:41 1080085  /lib/tls/libc-2.3.3.so
[5]  004cb000-004cd000 r--p 00115000 16:41 1080085  /lib/tls/libc-2.3.3.so
[6]  004cd000-004cf000 rw-p 00117000 16:41 1080085  /lib/tls/libc-2.3.3.so
[7]  004cf000-004d1000 rw-p 004cf000 00:00 0
[8]  08048000-08049000 r-xp 00000000 16:06 66970    /tmp/test
[9]  08049000-0804a000 rw-p 00000000 16:06 66970    /tmp/test
[10] b7fec000-b7fed000 rw-p b7fec000 00:00 0
[11] bffeb000-c0000000 rw-p bffeb000 00:00 0
[12] ffffe000-fffff000 ---p 00000000 00:00 0

Note: I added a number to each line for reference.

Back to gdb, type:

(gdb) q

So, in total, we see 12 segments (also known as Virtual Memory Areas, or VMAs).
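To make the columns of that listing concrete, here is a minimal Python sketch that splits one maps line into its fields. The `Vma` type and the `parse_maps_line` name are made up for illustration, and the sample lines are copied from the output above with the `[n]` reference numbers removed (they are not part of the real file).

```python
from typing import NamedTuple, Optional


class Vma(NamedTuple):
    """One line of /proc/<pid>/maps (hypothetical helper type)."""
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    path: Optional[str]

    @property
    def size(self) -> int:
        return self.end - self.start


def parse_maps_line(line: str) -> Vma:
    # Fields: address range, permissions, file offset, device, inode, optional path.
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else None
    return Vma(start, end, fields[1], int(fields[2], 16),
               fields[3], int(fields[4]), path)


SAMPLE = """\
08048000-08049000 r-xp 00000000 16:06 66970    /tmp/test
08049000-0804a000 rw-p 00000000 16:06 66970    /tmp/test
bffeb000-c0000000 rw-p bffeb000 00:00 0
"""

if __name__ == "__main__":
    for line in SAMPLE.splitlines():
        vma = parse_maps_line(line)
        print(f"{vma.start:08x} {vma.size:>8} {vma.perms} "
              f"{vma.path or '[anonymous]'}")
```

On a live Linux system the same function can be fed the lines of `/proc/self/maps`, which avoids the "program exits too fast" problem because the process inspects itself.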

But I want to know about Windows processes and the PE file format.

  1. Are there any tools for getting the layout (segments) of a running process in Windows?
  2. Are there any other good resources for learning more about this subject?

EDIT:

Are there any good articles that show the mapping between PE file sections and VA segments?

+1  A: 

Run "!address" in WinDbg on the running process. You will see every virtual memory segment in the process with some classification - image, memory mapped file, stack, heap, PEB, TEB, etc.

Windows Internals is always a good reference for things like this.

Here's the first few entries for notepad:

        BaseAddress      EndAddress+1        RegionSize     Type       State                 Protect             Usage
----------------------------------------------------------------------------------------------------------------------
*        0`00000000        0`00be0000        0`00be0000             MEM_FREE    PAGE_NOACCESS                      Free 
*        0`00be0000        0`00bf0000        0`00010000 MEM_MAPPED  MEM_COMMIT  PAGE_READWRITE                     MemoryMappedFile "PageFile"
*        0`00bf0000        0`00bf7000        0`00007000 MEM_MAPPED  MEM_COMMIT  PAGE_READONLY                      MemoryMappedFile "PageFile"
*        0`00bf7000        0`00c00000        0`00009000             MEM_FREE    PAGE_NOACCESS                      Free 
*        0`00c00000        0`00c03000        0`00003000 MEM_MAPPED  MEM_COMMIT  PAGE_READONLY                      MemoryMappedFile "PageFile"
*        0`00c03000        0`00c10000        0`0000d000             MEM_FREE    PAGE_NOACCESS                      Free 
*        0`00c10000        0`00c12000        0`00002000 MEM_MAPPED  MEM_COMMIT  PAGE_READONLY                      MemoryMappedFile "PageFile"
*        0`00c12000        0`00c20000        0`0000e000             MEM_FREE    PAGE_NOACCESS                      Free 
*        0`00c20000        0`00c21000        0`00001000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE                     <unclassified> 
*        0`00c21000        0`00c30000        0`0000f000             MEM_FREE    PAGE_NOACCESS                      Free 
*        0`00c30000        0`00c97000        0`00067000 MEM_MAPPED  MEM_COMMIT  PAGE_READONLY                      MemoryMappedFile "\Device\HarddiskVolume2\Windows\System32\locale.nls"
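A rough programmatic analogue of that listing can be built on the Win32 `VirtualQuery` call. The sketch below is not how WinDbg produces its output; it is a minimal Python/ctypes illustration that walks the address space of the current process only (inspecting another process would need `VirtualQueryEx` and a process handle). The `walk_regions` name and the lookup tables are made up for this example, and the function yields nothing on non-Windows platforms.

```python
import ctypes
import sys

# Values of the State and Type fields, as defined in the Windows SDK headers.
STATE_NAMES = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE", 0x10000: "MEM_FREE"}
TYPE_NAMES = {0x20000: "MEM_PRIVATE", 0x40000: "MEM_MAPPED", 0x1000000: "MEM_IMAGE"}


class MEMORY_BASIC_INFORMATION(ctypes.Structure):
    # Classic layout; ctypes inserts the alignment padding on 64-bit builds.
    _fields_ = [
        ("BaseAddress", ctypes.c_void_p),
        ("AllocationBase", ctypes.c_void_p),
        ("AllocationProtect", ctypes.c_uint32),
        ("RegionSize", ctypes.c_size_t),
        ("State", ctypes.c_uint32),
        ("Protect", ctypes.c_uint32),
        ("Type", ctypes.c_uint32),
    ]


def walk_regions():
    """Yield (base, size, state, type) for each region of this process."""
    if sys.platform != "win32":
        return  # VirtualQuery exists only on Windows
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    query = kernel32.VirtualQuery
    query.argtypes = [ctypes.c_void_p,
                      ctypes.POINTER(MEMORY_BASIC_INFORMATION),
                      ctypes.c_size_t]
    query.restype = ctypes.c_size_t
    mbi = MEMORY_BASIC_INFORMATION()
    address = 0
    # VirtualQuery returns 0 once the address is past the user-mode range.
    while query(address, ctypes.byref(mbi), ctypes.sizeof(mbi)):
        base = mbi.BaseAddress or 0  # c_void_p reads back as None for 0
        yield (base, mbi.RegionSize,
               STATE_NAMES.get(mbi.State, hex(mbi.State)),
               TYPE_NAMES.get(mbi.Type, ""))
        address = base + mbi.RegionSize


if __name__ == "__main__":
    for base, size, state, kind in walk_regions():
        print(f"{base:016x} {size:>12x} {state:<12} {kind}")
```

Unlike `!address`, this does not classify regions as stack, heap, PEB, and so on; it only reports what the memory manager itself records.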
Michael
Awesome!! This is what I was looking for. So many mysteries solved in a single day. :)
claws
+2  A: 

Sysinternals VMMap is also an excellent tool for visualizing the VA space of a process:

[Image: VMMap screenshot]

Paul Betts
Thanks Paul, This is more than what I expected. I absolutely love it.
claws
A: 

Another virtual memory viewer is VMValidator. It shows the memory layout visually, plus data on memory pages and memory paragraphs.

As for the layout of PE files, I recommend the book Expert .NET 2.0 IL Assembler, chapter 4. It's principally aimed at a managed (.NET) PE file rather than a native one, but it does describe how it's all laid out.

Then if you want to see some source code (C++) that reads a PE file you should take a look at PE File Format DLL. There is also a GUI that shows you how to use the DLL. The license for the source is open source and not restricted by the GPL.
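For a feel of what such a reader does, here is a minimal Python sketch (not the PE File Format DLL's code) that walks from the DOS header to the section table. The `read_sections` name is made up for this example; the field order follows the documented `IMAGE_FILE_HEADER` and `IMAGE_SECTION_HEADER` layouts, and the optional header is skipped rather than parsed.

```python
import struct

# IMAGE_SECTION_HEADER: Name[8], VirtualSize, VirtualAddress, SizeOfRawData,
# PointerToRawData, PointerToRelocations, PointerToLinenumbers,
# NumberOfRelocations, NumberOfLinenumbers, Characteristics (40 bytes).
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")
COFF_HEADER = struct.Struct("<HHIIIHH")  # IMAGE_FILE_HEADER, 20 bytes


def read_sections(image: bytes):
    """Return (name, virtual_address, virtual_size, raw_offset, raw_size,
    characteristics) for every section in a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' signature")
    # e_lfanew at offset 0x3C points to the 'PE\0\0' signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (_machine, section_count, _timestamp, _symbols, _symbol_count,
     optional_size, _flags) = COFF_HEADER.unpack_from(image, pe_offset + 4)
    # The section table follows the optional header.
    table = pe_offset + 4 + COFF_HEADER.size + optional_size
    sections = []
    for index in range(section_count):
        (name, vsize, vaddr, raw_size, raw_offset,
         _relocs, _lines, _nrelocs, _nlines, flags) = SECTION_HEADER.unpack_from(
            image, table + index * SECTION_HEADER.size)
        sections.append((name.rstrip(b"\0").decode("ascii", "replace"),
                         vaddr, vsize, raw_offset, raw_size, flags))
    return sections
```

Typical use would be `read_sections(open(path, "rb").read())`, where `path` names any DLL or EXE on your system.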

EDIT: Another book recommendation would be Inside Microsoft Windows 2000 (3rd Edition) by David A. Solomon and Mark E. Russinovich (Russinovich is one of the authors of VMMap, mentioned in a different answer). This book has sections on memory management, from the page table layout through to more macro-scale memory management, and another chapter about various issues to do with processes, threads and related data structures.

Regarding PE layout and virtual address layout: a DLL is loaded into a memory area that starts on a paragraph boundary (64K on x86, the same allocation granularity that VirtualAlloc() uses). The memory protection of the various pages inside this area (4K on both x86 and x64; 8K on Itanium) is set according to how each section is described in the PE file (read-only, read/execute, read/write, etc.). Thus knowing the PE file layout is useful, which is why I mentioned it.
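As a rough illustration of that section-to-page mapping, the Python sketch below turns a section's Characteristics flags into the name of a page protection and computes the page-aligned address range the section would occupy. It is a simplification under stated assumptions: a 4 KB page size, the image loaded at the given base, and plain read/write names where the real loader may use copy-on-write variants for writable image pages. The function names are made up for this example; the flag values are the documented `IMAGE_SCN_MEM_*` constants.

```python
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000
PAGE_SIZE = 0x1000  # assumed 4 KB pages


def section_protection(characteristics: int) -> str:
    """Map section flags to the closest page protection name (simplified)."""
    execute = bool(characteristics & IMAGE_SCN_MEM_EXECUTE)
    write = bool(characteristics & IMAGE_SCN_MEM_WRITE)
    # There is no write-only page protection, so write implies read here.
    read = bool(characteristics & IMAGE_SCN_MEM_READ) or write
    if execute:
        if write:
            return "PAGE_EXECUTE_READWRITE"
        return "PAGE_EXECUTE_READ" if read else "PAGE_EXECUTE"
    if write:
        return "PAGE_READWRITE"
    return "PAGE_READONLY" if read else "PAGE_NOACCESS"


def mapped_range(image_base: int, virtual_address: int, virtual_size: int):
    """Return the (start, end) addresses of a section, rounded up to pages."""
    start = image_base + virtual_address
    pages = -(-virtual_size // PAGE_SIZE)  # ceiling division
    return start, start + pages * PAGE_SIZE


if __name__ == "__main__":
    # Flag values commonly seen on .text, .rdata and .data sections.
    for name, flags in ((".text", 0x60000020), (".rdata", 0x40000040),
                        (".data", 0xC0000040)):
        print(name, section_protection(flags))
```

Comparing these computed ranges with the image regions reported by a memory viewer is one way to see how PE sections line up with regions of the process's virtual address space.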

If you are planning on experimenting with modifying DLLs or performing instrumentation, having a tool that lets you easily view the DLL contents is very useful. Hence the link to the PE File Format DLL. It's also a good base to start from for your own specific requirements.

Stephen Kellett
Do you mean the "Expert .NET 2.0 IL Assembler" ( http://www.amazon.com/dp/1590596463 ) book? If so, please add a link to the book. It just avoids possible confusion.
claws
+1 Also, I don't want the layout of PE Files. I want the mapping between PE file sections and `Segments` of its processes. Actually, for ELF files there is a "Program Header Table" in the ELF itself which clearly shows this mapping
claws
I did mean "Expert...", don't know how I missed that when writing the post. I've modified the post.
Stephen Kellett
"Inside Microsoft Windows 2000 (3rd Edition)": this book seems to be in its 5th edition. Do you still suggest the 3rd edition?
claws
I recommend the latest edition. When I looked it up, I found the 3rd edition and assumed that Microsoft had (strangely) not updated it. Good to see that assumption was incorrect.
Stephen Kellett