Hi;

It's a known fact that Windows applications usually have 2 GB of private address space on a 32-bit system. This space can be extended to 3 GB with the /3GB switch.

The operating system reserves the remainder of the 4 GB for itself.

My question is WHY?

Code running in kernel mode (i.e. device driver code) has its own address space. Why, on top of an exclusive 4 GB address space, does the operating system still want to reserve 2 GB of each user-mode process?

I thought the reason was the transition between user mode and kernel mode during a call. For example, a call to NtWriteFile will need an address for the kernel dispatch routine (hence the system reserving 2 GB in each application). But, using SYSENTER, isn't the system service number enough for the kernel-mode code to know which function/service is being called?

If you could clarify why it's so important for the operating system to take 2 GB (or 1 GB) of each user-mode process, I'd be thankful.

Thank you.

+1  A: 

I believe the best answer is that the OS designers felt that by the time you would have to care, people would be using 64-bit Windows.

But here's a better discussion.

Dave Markle
It's not only a Windows thing; Linux also reserves part of the address space for the kernel. I just can't understand why, since code running in kernel mode has its own address space.
Well, the hardware itself also often likes to have memory of its own, and that's probably most of the reason.
Dave Markle
The hardware still can get memory of its own in the kernel address space.
+8  A: 

Raymond Chen had a bunch of articles on this topic.

Sinan Ünür
A: 

Part of the answer is to do with the history of microprocessor architectures. Here's some of what I know; others can provide more recent details.

The Intel 8086 processor had a segment-offset architecture for memory, giving 20 bit memory addresses, and therefore total addressable physical memory of 1MB.
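
As a rough illustration of where the 20 bits come from, here is a minimal sketch of the real-mode translation: the 16-bit segment is shifted left by four bits and the 16-bit offset is added. The function name `physical_address` is made up for this example.

```python
def physical_address(segment: int, offset: int) -> int:
    """Translate an 8086 real-mode segment:offset pair to a physical address."""
    # The segment is multiplied by 16 (shifted left 4 bits), the offset is
    # added, and the result is truncated to 20 bits, i.e. it wraps at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF


ADDRESS_SPACE = 1 << 20  # 2**20 bytes = 1 MB of addressable memory

print(hex(physical_address(0xB800, 0x0000)))  # 0xb8000
print(ADDRESS_SPACE // 1024, "KB")            # 1024 KB
```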

Unlike competing processors of the era - like the Zilog Z80 - the Intel 8086 had only one address space which had to accommodate not only electronic memory, but all input/output communication with such minor peripherals as keyboard, serial ports, printer ports and video displays. (For comparison, the Zilog Z80 had a separate input/output address space with dedicated assembly opcodes for access)

The need to allow space for an ever-growing range of peripheral expansions led to the original decision to segment the address space into electronic memory from 0-640K, and "other stuff" (input/output, ROMs, video memory etc.) from 640K to 1MB.

As the x86 line grew and evolved, and PCs evolved with it, similar schemes have been used, ending with today's 2G/2G split of the 4G address space.

Bevan
I don't think this is a suitable explanation on any hardware with an MMU. Memory-mapped I/O exists in the physical address space. The 2GB+2GB virtual address space in use at any given point in time could very well have *none* of the I/O space mapped.
ephemient
I agree, theoretically. From a performance perspective, if an application has a chunk of video data to display, and there's nowhere in that application's memory space to even allocate a buffer, how do you marshal it through to the video hardware efficiently? Reserving room in the address space provides a way to achieve this. Keep in mind too that the decision on the 2G/2G split was made a LONG time ago: it well predates Windows XP, which itself was targeted at machines with 64M of memory. Plus, my comments are about the (ancient) history, not whether it's a good idea or not.
Bevan
An application that needs a direct video buffer can ask for one to be mapped into its address space; the mapping should not need to exist when not requested. And as I said in my answer, the 2G/2G (or 3G/1G) split is not so fundamental that it can't be changed -- a 4G userspace is simply too expensive to context-switch.
ephemient
+2  A: 

Code running in kernel mode (i.e. device driver code) has its own address space.

No, it does not. On x86 processors it has to share that address space with the user-mode portion of a process. That's why the kernel has to reserve enough space for itself within the total, finite address space.

nos
I am reading the book "Developing drivers with Windows Driver Foundation". Page 35: "Client and driver run in different address space, so the driver must access the buffer carefully. In particular, a driver cannot simply dereference a pointer to a user-mode buffer with any certainty that the data at that address is meaningful or that the pointer is even valid." Before reading that, I thought that kernel-mode code runs in the same address space as whatever process was scheduled at the moment. Maybe the book is wrong.
The kernel-mode stuff certainly has its own address space. The trick is that Windows is a lot more than kernel-mode stuff.
jalf
The kernel can't dereference a pointer from userspace directly for the simple reason that the userspace program might have lied and sent in a bad pointer. Dereferencing it would crash the OS. The other reason is that the target of the pointer could be swapped out or not mapped in memory, and extra care then needs to be taken not to, e.g., trigger a page fault from an interrupt handler or other sensitive areas. Note, the reason the address spaces aren't separate is that if they were, you would need a TLB flush on every context switch, which would be too expensive to justify.
nos
+8  A: 
ephemient
Ty for this clear answer. So, to summarize, from a design perspective, a kernel can either run in its own address space and incur a performance penalty due to process context switching, or be mapped into each user-mode process's address space and eat some of that address space.
Not necessarily. There are other ways of achieving separation, for example static typing. In Microsoft Research's Singularity OS, all user processes and the kernel live in one single address space (indeed, the MMU is actually turned off, as much as is possible on x86). Protection between user processes and between kernel and userspace is guaranteed by the compiler: because all code is written in a memory-safe and type-safe language (C#), and no direct memory access is allowed, there is no need for hardware separation.
Jörg W Mittag
Windows and all other actively deployed OSes I'm aware of are fundamentally based on executing native applications with privilege separation by hardware (or none at all), and incompatible with the Singularity model. Yes, there are a few research OSes which are based on compile-time and VM-enforced safety instead of the hardware-supported virtual address spaces we've been accustomed to for the last few decades. Unfortunately, as cool as the idea may be, it is a "rewrite everything" scenario...
ephemient
+2  A: 

Windows (like any OS) is a lot more than the kernel + drivers.

Your application relies on a lot of OS services that do not just exist in kernel space. There are a lot of buffers, handles and all sorts of resources that can get mapped to your process' own address space. Whenever you call a Win32 API function that returns, say, a window handle, or a brush, those things have to be allocated somewhere in your process. So part of Windows runs in the kernel, yes, other parts run in their own user-mode processes, and some, the ones your application needs direct access to, are mapped to your address space. Part of this is hard to avoid, but an important additional factor is performance. If every Win32 call required a context switch, it would be a major performance hit. If some of them can be handled in usermode because the data they rely on is already mapped to your address space, the context switch is avoided, and you save quite a few CPU cycles.

So any OS needs some amount of the address space set aside. I believe Linux by default sets aside only 1GB for the OS.
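
To make the two splits concrete, here is a small sketch that computes the address ranges for a 2G/2G and a 3G/1G layout of a 32-bit virtual address space. It is simplified arithmetic only: the helper name `split` is made up for this example, and it ignores the small guard regions a real OS keeps around the boundary.

```python
GB = 1 << 30
TOTAL = 1 << 32  # a 32-bit virtual address space covers 4 GB


def split(user_gb: int):
    """Return (user_range, kernel_range) as inclusive (start, end) pairs."""
    boundary = user_gb * GB
    return (0, boundary - 1), (boundary, TOTAL - 1)


for label, user_gb in (("2G/2G", 2), ("3G/1G", 3)):
    (u_lo, u_hi), (k_lo, k_hi) = split(user_gb)
    print(f"{label}: user {u_lo:#010x}-{u_hi:#010x}, "
          f"kernel {k_lo:#010x}-{k_hi:#010x}")
```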

The reason why MS settled on 2GB with Windows was explained on Raymond Chen's blog once. I don't have the link, and I can't remember the details, but the decision was made because Windows NT was originally targeted at Alpha processors as well, and on Alphas, there was some REALLY good reason to do the 50/50 split. ;)

It was something to do with the Alpha's support for 32-bit as well as 64-bit code. :)

jalf
Ty for your answer. I can add this from Windows Internals: "The system service dispatcher, KiSystemService, copies the caller's arguments from the thread's user-mode stack to its kernel-mode stack (so that the user can't change the arguments as the kernel is accessing them), and then executes the system service. If the arguments passed to a system service point to buffers in user space, these buffers must be probed for accessibility before kernel-mode code can copy data to or from them." So basically, the kernel likes to be present in the process's user address space for performance reasons.
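
The two steps in that quote (capture the arguments, then probe any user buffer) can be modelled very roughly as below. This is only a toy sketch, not how Windows implements it: `USER_LIMIT`, `AccessViolation`, `probe_user_buffer` and `system_service` are hypothetical names, and the boundary assumes a plain 2G/2G split.

```python
USER_LIMIT = 0x80000000  # assumed user/kernel boundary for a 2G/2G split


class AccessViolation(Exception):
    """Raised when a supplied buffer is not entirely in user space."""


def probe_user_buffer(address: int, length: int) -> None:
    """Reject a buffer unless it lies entirely below the user/kernel boundary."""
    if address < 0 or length < 0:
        raise AccessViolation("negative address or length")
    if address + length > USER_LIMIT:
        raise AccessViolation(
            f"buffer {address:#x}+{length:#x} reaches kernel space")


def system_service(user_args: list) -> tuple:
    # Capture the arguments first, so later changes made by the caller
    # cannot affect what the service works with.
    captured = tuple(user_args)
    address, length = captured
    # Only then validate the buffer the arguments point to.
    probe_user_buffer(address, length)
    return captured


args = [0x00401000, 0x100]
captured = system_service(args)
args[0] = 0xFFFF0000  # the caller changing its copy has no effect on ours
print(captured == (0x00401000, 0x100))  # True
```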