ansaurus

Question

What is the overhead involved in a mode switch?

Answer 1

+1 A:

There should be no CPU cache or TLB flush on a simple mode switch.

A quick test on my Linux laptop shows that it takes about 0.11 microseconds for a userspace process to complete a simple syscall that does almost no work beyond switching to kernel mode and back. I'm using getuid(), which only copies a single integer from an in-memory struct. strace confirms that the syscall is repeated MAX times.

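#include <unistd.h>   /* getuid() */

#define MAX 100000000 /* number of syscalls to time */

int main(void) {
  int ii;
  /* Each getuid() call enters the kernel and returns: two mode switches. */
  for (ii = 0; ii < MAX; ii++) getuid();
  return 0;
}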

This takes about 11 seconds on my laptop, measured using `time ./testover`, and 11 seconds divided by 100 million calls gives 0.11 microseconds per call.

Technically, that's two mode switches, so I suppose you could claim that a single mode switch takes 0.055 microseconds, but a one-way switch isn't very useful, so I'd consider the there-and-back number to be the more relevant one.

Eric Seppanen 2009-12-07 18:26:20
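
As a hedged variation on the test above (not the answerer's code), the loop can time itself instead of relying on `time`. This is a minimal sketch assuming a POSIX system with clock_gettime(); the helper names per_call_ns and time_getuid_loop are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Average cost of one call in nanoseconds, from total seconds and call count. */
double per_call_ns(double total_seconds, long calls) {
    assert(calls > 0);
    return total_seconds * 1e9 / (double)calls;
}

/* Run getuid() `calls` times and return the elapsed wall-clock seconds. */
double time_getuid_loop(long calls) {
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < calls; i++)
        getuid();   /* one round trip: user -> kernel -> user */
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling per_call_ns(time_getuid_loop(n), n) from a main() reports the per-call cost for n iterations. Plugging in the figures quoted above, per_call_ns(11.0, 100000000) gives 110 ns, which is the 0.11 microsecond number. The result includes loop overhead and will vary by CPU and kernel version.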

Answer 2

A:

There are many ways to do a mode switch on x86 CPUs (which I am assuming here). For a user-called function, the normal way is to do a Task jump or Call (referred to as Task Gates and Call Gates). Both of these involve a Task switch (equivalent to a context switch). Add to that a bit of processing before the call, the standard verification after the call, and the return. That is the bare minimum for a safe mode switch.

As for Eric's timing, I am not a Linux expert, but in most OSes I have dealt with, simple system calls cache data in user space (if it can be done safely) to avoid this overhead. It would seem to me that getuid() would be a prime candidate for such caching. Thus Eric's timing could be more a reflection of the overhead of the pre-switch processing in user space than anything else.

Juice 2009-12-07 18:56:55

No data caching is happening; that's why I used strace to verify that the system call is taking place. A syscall, by definition, is a call into kernel space.

Eric Seppanen 2009-12-07 19:11:41
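
Besides strace, one way to rule out a user-space cache is to bypass the getuid() wrapper entirely. The sketch below assumes Linux with glibc's generic syscall() function; the names RAW_GETUID, raw_getuid and check_no_caching are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Prefer the 32-bit-UID syscall where the architecture defines one (e.g. i386). */
#ifdef SYS_getuid32
#define RAW_GETUID SYS_getuid32
#else
#define RAW_GETUID SYS_getuid
#endif

/* Ask the kernel for the real UID directly, without the getuid() wrapper. */
uid_t raw_getuid(void) {
    return (uid_t)syscall(RAW_GETUID);
}

/* The wrapper and the raw syscall should report the same UID. */
void check_no_caching(void) {
    assert(raw_getuid() == getuid());
}
```

Substituting raw_getuid() for getuid() in the timing loop gives a measurement in which every iteration goes through syscall() into the kernel, so a similar per-call time would suggest the wrapper is not returning a cached value.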

Also, your assertion that a mode switch is equivalent to a context switch is not true. A context switch involves swapping out the entire CPU state and page tables; this is significant work that is not required for system calls. A syscall is a simple software interrupt (x86 assembly "int $0x80").

Eric Seppanen 2009-12-07 19:24:21

So in the syscall there is only a preamble and an INT instruction?

Juice 2009-12-07 19:47:51

Inform yourself: a software interrupt to go into privileged mode on an x86 has to be made through a Trap or Interrupt gate, and they are the same as a Call gate, which involves a full save of the state of the current Task, followed by a load of the destination Task.

Juice 2009-12-07 19:52:07

getuid() on i386 Linux is mov $0xC7,%eax ; int $0x80. Disassemble it yourself and see. Note that the syscall number is being passed in eax (0xC7=199, which is getuid32, the 32-bit-UID variant that getuid() invokes); that demonstrates that CPU registers survive the interrupt. And of course there is no effect on page tables, TLB, or CPU caches, which are the important costs of a context switch.

Eric Seppanen 2009-12-07 21:27:54

Juice: The `int 0x80` syscall entry on x86 Linux uses a Trap Gate, which does *not* change the task (`tr` register). It merely specifies a Segment Selector and Offset to jump to. The Segment Selector itself includes the new privilege level (DPL) - and because we're changing privilege level, we get a new stack segment and stack pointer as well - these are loaded from the TSS.

caf 2009-12-07 23:24:56
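
The distinction between gate types can be sketched as a data structure. The layout below follows the 8-byte IDT gate descriptor of 32-bit protected mode; the helper names, the handler address and the selector value used in the usage note are hypothetical, and this is an illustration rather than Linux's own definition.

```c
#include <assert.h>
#include <stdint.h>

/* Gate type codes in the low 4 bits of the attribute byte (32-bit mode). */
enum gate_type { GATE_TASK = 0x5, GATE_INTERRUPT = 0xE, GATE_TRAP = 0xF };

/* One 8-byte IDT entry in 32-bit protected mode. */
struct idt_gate {
    uint16_t offset_low;   /* handler address bits 0..15 (unused by task gates) */
    uint16_t selector;     /* code segment selector (TSS selector for task gates) */
    uint8_t  reserved;     /* zero */
    uint8_t  type_attr;    /* P (bit 7) | DPL (bits 6..5) | 0 | type (bits 3..0) */
    uint16_t offset_high;  /* handler address bits 16..31 (unused by task gates) */
};

/* Build a present gate; dpl is the lowest privilege allowed to invoke it via `int`. */
struct idt_gate make_gate(uint32_t handler, uint16_t selector,
                          enum gate_type type, unsigned dpl) {
    struct idt_gate g;
    assert(dpl <= 3);
    g.offset_low  = (uint16_t)(handler & 0xFFFFu);
    g.selector    = selector;
    g.reserved    = 0;
    g.type_attr   = (uint8_t)(0x80u | (dpl << 5) | (unsigned)type);
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}

/* Only a task gate makes the CPU switch hardware tasks (reload `tr`). */
int gate_switches_task(const struct idt_gate *g) {
    return (g->type_attr & 0x0F) == GATE_TASK;
}

/* Reassemble the handler address from its two halves. */
uint32_t gate_handler(const struct idt_gate *g) {
    return ((uint32_t)g->offset_high << 16) | g->offset_low;
}
```

For example, make_gate(0xC0123456, 0x60, GATE_TRAP, 3) describes a trap gate that user code (ring 3) may invoke with `int`; gate_switches_task() returns 0 for it, matching the point that no task switch takes place. The same call with GATE_TASK returns 1.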

It seems you have read a bit on it but are still missing some. Unfortunately these comments are too small to debate the details of Task switching. But let me say this: It is not safe to not use a Task switch for anything accessing internal OS data (BTW this is a hardware task, not a thread, process or other OS entity). And I refuse to believe that Linux is unsafe. What might be happening is an Interrupt gate to a conforming code segment (no task switch, very fast and efficient) where some code makes the decision to either do a Call gate or return cached info.

Juice 2009-12-08 16:57:17

I think you're mistaken - the Linux kernel hasn't used hardware tasks for some time. There is just one TSS per cpu, plus a double-fault handler TSS on i386 (double fault is the only place task gates are used). If it was unsafe I'm sure one of the Intel employees that are heavily involved in Linux kernel development would have said so by now. See the comment at line 35 in `init_task.c`: http://lxr.linux.no/#linux+v2.6.32/arch/x86/kernel/init_task.c

caf 2009-12-08 21:30:17

I was mistaken. In my defense, it has been a while since I played with this stuff. You can do a safe, higher-privilege call without a Task switch (with a Call, Interrupt or Trap gate). My apologies.

Juice 2009-12-10 20:05:32
