For the following snippet of code,

int n;
char buf[100];
int fd = open ("/etc/passwd", O_RDONLY);
n = read ( fd, buf, 100);

How does the compiler know that read is a system call and not a library function?

How does it retrieve the system call number (__NR_read)?

+10  A: 

I very much doubt that the compiler knows it's a system call. It's far more likely that open is in a library somewhere and the code within the library calls the relevant kernel interface.

The assembly output from the simple program:

#include <stdio.h>
int main (void) {
    int fd = open("xyz");
    return 0;
}

is (irrelevant bits removed):

main:
    pushl   %ebp            ; stack frame setup.
    movl    %esp, %ebp
    andl    $-16, %esp
    subl    $32, %esp

    movl    $.LC0, (%esp)   ; Store file name address.
    call    open            ; call the library function.
    movl    %eax, 28(%esp)  ; save returned file descriptor.

    movl    $0, %eax        ; return 0 error code.

    leave                   ; stack frame teardown.
    ret

.LC0:
    .string "xyz"           ; file name to open.

The first thing you'll notice is that there's a call to open. In other words, it's a function. There's not an int $0x80 or sysenter in sight, and those are the mechanisms used for actual system calls (on my platform anyway - YMMV).

The wrapper functions in libc are where the actual work of accessing the system call interface is done.
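As a rough illustration of that split, the sketch below issues the same read request twice: once through the ordinary libc wrapper, and once through libc's generic syscall() entry point. It assumes Linux with glibc; the helper names read_via_wrapper and read_via_syscall are made up for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_read */
#include <sys/types.h>     /* ssize_t */
#include <unistd.h>        /* read(), syscall() */

/* Hypothetical helper: goes through the libc wrapper.
 * To the compiler this is an ordinary function call. */
ssize_t read_via_wrapper(int fd, void *buf, size_t len)
{
    return read(fd, buf, len);
}

/* Hypothetical helper: skips the read() wrapper and passes the
 * system call number to libc's generic syscall() function. */
ssize_t read_via_syscall(int fd, void *buf, size_t len)
{
    return (ssize_t)syscall(SYS_read, fd, buf, len);
}
```

Both helpers should behave the same from the caller's point of view. In neither case does the compiler itself emit the trap instruction; that lives inside libc.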

An excerpt from Wikipedia on system calls:

Generally, systems provide a library that sits between normal programs and the operating system, usually an implementation of the C library (libc), such as glibc. This library exists between the OS and the application, and increases portability.

On exokernel based systems, the library is especially important as an intermediary. On exokernels, libraries shield user applications from the very low level kernel API, and provide abstractions and resource management.

The terms "system call" and "syscall" are often incorrectly used to refer to C standard library functions, particularly those that act as a wrapper to corresponding system calls with the same name. The call to the library function itself does not cause a switch to kernel mode (if the execution was not already in kernel mode) and is usually a normal subroutine call (i.e., using a "CALL" assembly instruction in some ISAs). The actual system call does transfer control to the kernel (and is more implementation-dependent than the library call abstracting it). For example, fork and execve are GLIBC functions that in turn call the fork and execve system-calls.

And, after a bit of searching, the __open function is found in glibc 2.9 in the io/open.c file, and weakref'ed to open. If you execute:

nm /usr/lib/libc.a | egrep 'W __open$|W open$'

you can see them in there:

00000000 W __open
00000000 W open
paxdiablo
open is also a system call. There's no code for open in any library. How does the compiler resolve the open call?
Ganesh Kundapur
Not so, see the update.
paxdiablo
Let's say I added my own system call and rebuilt the kernel, and there is no wrapper function for it in libc.
Ganesh Kundapur
@Ganesh, you _can_ create your own system calls. And you can call them from C code with `syscall` if you don't want to provide your own wrapper usable by the C compiler. Or you could inline-assemble the int 0x80 yourself if you wanted to.
paxdiablo
jweyrich
Thanks, @jweyrich, updated to fix.
paxdiablo
@Ganesh: If your kernel supports system calls that glibc doesn't have a wrapper for, you *can't* call them by name as ordinary functions; you have to go through `syscall()`. `gettid()` is a common example of this.
caf
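The case discussed in these comments, a system call that has no libc wrapper, might look like the sketch below. It assumes Linux with glibc and uses gettid as the example; raw_gettid is a hypothetical name chosen for this illustration. Newer glibc releases do ship a gettid() wrapper, so the indirection is only needed where that wrapper is missing.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_gettid */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: asks the kernel for the calling thread's ID
 * by number, without relying on a dedicated libc wrapper. */
pid_t raw_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

In a single-threaded process the value returned should match getpid(), since the main thread's ID equals the process ID on Linux.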
+4  A: 

read is a library call as far as the compiler is concerned. It just so happens that the libc implementation defines read to generate a software interrupt with the correct number.

doron
I don't think there's any read implementation in libc?
Ganesh Kundapur
@Ganesh Kundapur yes there is. All system calls are wrapped in libc functions. (and there is a generic syscall() function if you want to build up the syscall yourself)
nos
+1  A: 

The compiler can see the declaration of this function in <unistd.h>, and it generates object code that makes a call to that function.

Try compiling with gcc -S and you'll see something like:

movl    $100, %edx
movq    %rcx, %rsi
movl    %eax, %edi
call    read

The system call is made from the C library's implementation of read(2).

EDIT: specifically, GNU libc (which is likely what you have on Linux) establishes the relationships between syscall numbers and function names in glibc-2.12.1/sysdeps/unix/syscalls.list. Each line of that file is converted into an assembly-language source file (based on sysdeps/unix/syscall-template.S), compiled, and added to the library when libc is built.
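To make the "compiled in" point concrete, here is a small sketch assuming Linux headers are installed. It shows that the system call number is just a preprocessor constant pulled in from the kernel headers, so nothing has to be looked up at run time. The function name read_syscall_number is invented for the example.

```c
#include <sys/syscall.h>   /* SYS_read, defined from the kernel's __NR_read */

/* Checked at compile time: the build fails if the two names disagree. */
_Static_assert(SYS_read == __NR_read, "SYS_read should equal __NR_read");

/* Hypothetical helper: returns the number exactly as it was
 * fixed when this file was compiled. */
long read_syscall_number(void)
{
    return SYS_read;
}
```

The actual value differs between architectures, which is one reason the number is kept in per-platform headers rather than written out in application code.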

Cubbi
yes, but internally it retrieves the system call number for read (__NR_read) and makes a call as syscall(__NR_read, fd, buf, 100);
Ganesh Kundapur
@Ganesh Kundapur: see edit. The system call is not "retrieved internally", it is compiled into libc.
Cubbi
A: 

open() is a library function; it is located in libc.a / libc.so

blaze
+1  A: 

The following is the Android implementation of read in bionic (the Android equivalent for libc)

/* autogenerated by gensyscalls.py */
#include <sys/linux-syscalls.h>

    .text
    .type read, #function
    .globl read
    .align 4
    .fnstart

read:
    .save   {r4, r7}
    stmfd   sp!, {r4, r7}
    ldr     r7, =__NR_read
    swi     #0
    ldmfd   sp!, {r4, r7}
    movs    r0, r0
    bxpl    lr
    b       __set_syscall_errno
    .fnend

You can see that it loads __NR_read into r7 and then executes SWI, the software interrupt instruction that switches the processor into kernel mode. So the compiler needs to know nothing about how to make system calls; libc takes care of it.
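For comparison, a wrapper in the same spirit as the bionic listing can be sketched in C with GCC-style inline assembly. This is not bionic's or glibc's actual code. It assumes Linux, uses the x86-64 syscall instruction where available, and falls back to libc's syscall() on other architectures. The name my_read is made up.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>   /* SYS_read */
#include <sys/types.h>     /* ssize_t */
#include <unistd.h>        /* syscall() for the fallback path */

/* Hypothetical wrapper: load the number, trap into the kernel,
 * then translate a negative result into errno and -1. */
ssize_t my_read(int fd, void *buf, size_t len)
{
#if defined(__x86_64__)
    long ret;
    /* rax = syscall number, rdi/rsi/rdx = arguments.
     * The syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"((long)SYS_read), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    if (ret < 0) {          /* the kernel returns -errno on failure */
        errno = (int)-ret;
        return -1;
    }
    return (ssize_t)ret;
#else
    /* Other architectures: let libc issue the trap. */
    return (ssize_t)syscall(SYS_read, fd, buf, len);
#endif
}
```

The shape mirrors the ARM version: a constant number goes into a register, one instruction enters the kernel, and the error path sets errno. Calling code still just sees a function named like any other.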

doron