views:

853

answers:

3

I've been wondering how scanf()/printf() actually work at the hardware and OS levels. Where does the data flow, and what exactly is the OS doing along the way? What calls does the OS make? And so on...

A: 

I think the OS just provides two streams, one for input and the other for output. The streams abstract away how the output data gets presented or where the input data comes from.

So all scanf and printf are doing is consuming bytes from the input stream or adding bytes to the output stream.

bashmohandes
This is the high-level abstraction. I wish to know the details of how these streams work with the hardware and how the OS manages all the data.
jetru
+11  A: 

scanf() and printf() are functions in libc (the C standard library), and on a Unix-like system they end up calling the read() and write() operating system syscalls respectively, on the file descriptors behind the stdin and stdout streams (fscanf and fprintf let you specify the stream you want to read from or write to).

Calls to read() and write() (and all syscalls) result in a switch out of your user-level application into kernel mode (loosely, a 'context switch'), which means the kernel code can perform privileged operations, such as talking directly to hardware. Depending on how you started the application, the 'stdin' and 'stdout' file descriptors are probably bound to a console device (such as tty0), or some sort of virtual console device (like that exposed by an xterm). read() and write() safely copy the data between your process and the kernel; in BSD-derived kernels the transfer is described by a structure called a 'uio'.

The format-string conversion part of scanf and printf does not occur in kernel mode, but in ordinary user mode (inside 'libc'). The general rule of thumb with syscalls is that you switch to kernel mode as infrequently as possible, both to avoid the performance overhead of switching and for security. You need to be very careful about anything that happens in kernel mode: less code in kernel mode means fewer bugs and security holes in the operating system.
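To make the split between user-mode formatting and the syscall concrete, here is a minimal sketch assuming a POSIX system. `my_printf_fd` is a hypothetical name invented for this illustration, not how any particular libc implements printf: it formats into a local buffer with `vsnprintf` (pure user-mode work) and only then hands the finished bytes to the kernel with `write()`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: format in user space, then pass the finished bytes
 * to the kernel with write(). Returns the number of bytes written, or -1.
 * Output longer than the local buffer is truncated in this sketch. */
ssize_t my_printf_fd(int fd, const char *fmt, ...)
{
    char buf[256];
    va_list ap;

    assert(fmt != NULL);

    /* Step 1: format-string conversion, entirely in user mode. */
    va_start(ap, fmt);
    int len = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);
    if (len < 0)
        return -1;
    if ((size_t)len >= sizeof buf)
        len = (int)sizeof buf - 1;

    /* Step 2: the syscall. write() may accept fewer bytes than requested,
     * so loop until everything has been handed over. */
    size_t off = 0;
    while (off < (size_t)len) {
        ssize_t n = write(fd, buf + off, (size_t)len - off);
        if (n < 0)
            return -1;
        off += (size_t)n;
    }
    return (ssize_t)off;
}
```

Calling `my_printf_fd(STDOUT_FILENO, "x=%d\n", 42)` would format the text in user mode and then send it to the kernel. A real libc additionally keeps the formatted bytes in a stream buffer and may delay the `write()`.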

By the way, all of this was written from a Unix perspective; I don't know how MS Windows works.

David Claridge
The context switch sounds slow if it reads each byte individually. Of course, it doesn't really matter in this day and age, but I'm just interested to know if I'm correct in this understanding.
Ray Hidayat
The read and write syscalls take a number of bytes to transfer via the UIO as parameters, so it doesn't have to make a separate syscall for every single byte. You would also think that for simpler input functions like getchar() there'd have to be a separate call for each character, but in fact these days libc is a bit cleverer than that and it keeps a buffer (inside libc). So it can avoid a lot of the performance overhead of context switching by filling up its buffer, then processing a bit of that each time you call getchar() or scanf(), until the buffer is empty, and only then making another syscall.
David Claridge
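The buffering described in the comment above can be sketched as follows, assuming a POSIX system. `struct in_stream` and `in_getc` are hypothetical names made up for this illustration; a real libc's `FILE` and `getchar()` are more elaborate. The `refills` counter is there only so you can see how rarely `read()` is called.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define IN_BUF_SIZE 1024

/* Hypothetical, simplified input stream: a file descriptor plus a
 * user-space buffer. */
struct in_stream {
    int fd;                 /* descriptor to read from */
    size_t pos;             /* next unread byte in buf */
    size_t len;             /* number of valid bytes in buf */
    unsigned long refills;  /* how many read() calls returned data */
    char buf[IN_BUF_SIZE];
};

/* Return the next byte as a non-negative int, or -1 on end-of-file or
 * error. Only goes to the kernel when the buffer has been used up. */
int in_getc(struct in_stream *s)
{
    assert(s != NULL);
    if (s->pos == s->len) {
        ssize_t n = read(s->fd, s->buf, sizeof s->buf);
        if (n <= 0)
            return -1;
        s->len = (size_t)n;
        s->pos = 0;
        s->refills++;
    }
    return (unsigned char)s->buf[s->pos++];
}
```

With `struct in_stream s = { .fd = STDIN_FILENO };`, repeated calls to `in_getc(&s)` hand out one byte at a time, but a syscall happens only when the buffer is empty.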
On the topic of "it doesn't really matter in this day and age", in fact you'd be surprised how significantly making syscalls all the time would impact performance. Consider that syscalls are *at least* 10 times as slow as a regular function call. If you have a buffer of size 1024 bytes, for example, you are only making 1/1024 as many syscalls. Writing a C implementation of the 'cp' command with and without a buffer is a great example; I'll post it in a few minutes.
David Claridge
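The promised 'cp' example does not appear in this thread, so what follows is only a sketch of the idea, assuming a POSIX system; it is not the commenter's code. `copy_fd` is a hypothetical helper that copies one descriptor to another with a caller-chosen buffer size and reports how many `read()` calls it needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd using a
 * buffer of bufsize bytes. Returns the number of read() calls made
 * (including the final one that reports end-of-file), or -1 on error. */
long copy_fd(int in_fd, int out_fd, size_t bufsize)
{
    assert(bufsize > 0);
    char *buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    long calls = 0;
    for (;;) {
        ssize_t n = read(in_fd, buf, bufsize);
        calls++;
        if (n == 0)
            break;                      /* end of input */
        if (n < 0) {
            free(buf);
            return -1;
        }
        /* write() may take fewer bytes than offered, so loop. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                free(buf);
                return -1;
            }
            off += w;
        }
    }
    free(buf);
    return calls;
}
```

For a 4096-byte input that is already fully available, a 1-byte buffer needs 4097 `read()` calls (one per byte plus the end-of-file read), while a 1024-byte buffer needs 5. Timing the two variants on a large file is a simple way to see the syscall overhead for yourself.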
I don't know how MS Windows works either. No, seriously. It's a miracle it works at all :-)
paxdiablo
Wow, this sounds good. How does the OS transfer the bytes from the keyboard to its UIO buffer? The read() and write() calls do that, but from where? Where do the bytes from the keyboard come from? The keyboard driver?
jetru
read() doesn't know about the keyboard; it's at a slightly higher layer of abstraction. It just knows about the device node it talks to, such as a console device. The driver for that device will provide a node in the file system that read() can talk to, and it's the driver that has to be able to actually get characters out of the hardware.
David Claridge
Ah ok, I see the big picture now. Thanks David! :)
jetru
Just one detail - it's not safe to assume that read() / write() are used internally. For one, they are POSIX functions, and as such aren't part of the C standard, so depending on the platform completely different low-level functions may be called.
DevSolar
@DevSolar, absolutely, I was just using a typical Linux system as an example.
David Claridge
Luckily for the performance of a lot of simple C code, stdio functions like `scanf` and `printf` don't call `read` or `write` every time. They (usually, I'm simplifying) actually maintain a buffer that they fill with the system call as rarely as they can to reduce the number of switches in and out of the kernel. A typical Windows libc implementation works much the same way, but uses the `ReadFile` and `WriteFile` API calls. The details inside the kernel are different, but the basic abstractions and overall data flow are very similar.
RBerteig
@DevSolar Of course, it differs based on platform. *nix is just a good place to start. I just wanted the idea. It would be cool if someone could add details about what the OS and the drivers are doing at a lower level. :)
jetru
@jetru - scanf() is completely ignorant of what the lower levels of the OS do. It's just a wrapper for fscanf( stdin, ... ). That in turn can be implemented in terms of fgetc() calls. That function, in turn, reads from the stream buffer, and if the buffer is exhausted, triggers some OS-specific function to replenish the buffer. (On POSIX systems, read().) What that function does on the driver level depends on the type of stream (file, terminal) and the OS kernel, and would be an issue only if you want to do *OS* development. Are there any *specific* questions you have there?
DevSolar
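As an illustration of the layering described in the comment above, here is a rough sketch of a `%d`-style conversion built only on standard `fgetc()` and `ungetc()`. `read_int` is a hypothetical function written for this example, not libc's actual scanf code, and it ignores integer overflow for brevity.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical stand-in for fscanf(stream, "%d", out).
 * Returns 1 if a number was read, 0 if the next input is not a number,
 * and EOF if the stream ended before any non-space input.
 * Simplification: overflow is not detected. */
int read_int(FILE *stream, int *out)
{
    assert(stream != NULL && out != NULL);
    int c;

    /* Skip leading whitespace, one character at a time. */
    while ((c = fgetc(stream)) != EOF && isspace(c))
        ;
    if (c == EOF)
        return EOF;

    int sign = 1;
    if (c == '+' || c == '-') {
        if (c == '-')
            sign = -1;
        c = fgetc(stream);
    }
    if (c == EOF || !isdigit(c)) {
        if (c != EOF)
            ungetc(c, stream);          /* leave it for the next reader */
        return 0;
    }

    int value = 0;
    while (c != EOF && isdigit(c)) {
        value = value * 10 + (c - '0');
        c = fgetc(stream);
    }
    if (c != EOF)
        ungetc(c, stream);              /* push back the first non-digit */

    *out = sign * value;
    return 1;
}
```

Nothing in this function knows about files, terminals or syscalls; all of that is hidden behind `fgetc()`, which refills the stream buffer when it runs dry.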
A: 

On the OS I am working with, scanf and printf are based on the functions getch() and putch().

ralu
Wow, that was something. I managed to switch the output from UART to TCP/IP when there was a single client connected. It was a simple non-preemptive cooperative microkernel for an embedded system.
ralu
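One way such retargeting can work is to route all output through a single replaceable character-output function. The sketch below is an assumption about the general technique, not the code from the system described above; `set_putch`, `put_str`, `put_uint` and `capture_putch` are hypothetical names, and the in-memory backend stands in for a UART or TCP/IP driver.

```c
#include <assert.h>
#include <stddef.h>

/* Type of a character-output backend (UART, TCP socket, memory, ...). */
typedef void (*putch_fn)(char c);

static putch_fn current_putch;      /* backend currently in use */

/* Switch all later output to a different backend. */
void set_putch(putch_fn fn)
{
    current_putch = fn;
}

/* Print a string one character at a time through the backend. */
void put_str(const char *s)
{
    assert(current_putch != NULL);
    while (*s != '\0')
        current_putch(*s++);
}

/* Print an unsigned integer in decimal through the backend. */
void put_uint(unsigned value)
{
    char digits[3 * sizeof(unsigned)];  /* enough decimal digits */
    size_t n = 0;

    assert(current_putch != NULL);
    do {
        digits[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    while (n > 0)
        current_putch(digits[--n]);     /* digits were produced in reverse */
}

/* Example backend: collect output in memory instead of sending it to
 * hardware. A UART or TCP backend would have the same signature. */
char capture[64];
size_t capture_len;

void capture_putch(char c)
{
    if (capture_len < sizeof capture - 1)
        capture[capture_len++] = c;
}
```

After `set_putch(capture_putch)`, calls such as `put_str("n=")` and `put_uint(1234)` end up in `capture`. Swapping in a different backend changes where the characters go without touching the formatting code.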