ansaurus

Question

core dump files on Linux: how to get info on opened files?

Answer 1

A:

A core dump is a copy of the memory the process had access to when it crashed. Depending on how the leak occurs, the process may already have lost its references to the handles, so the dump may prove useless.

lsof lists all currently open files on the system; you could check its output to find leaked sockets or files. Yes, you'd need to have the process running. You could run it under a dedicated username to make it easy to pick out the files opened by the process you are debugging.

I hope somebody else has better information :-)

Vinko Vrsalovic 2008-09-12 09:50:07

Answer 2

+2 A:

You can try using strace to see the open, socket and close calls the program makes.

Edit: I don't think you can get the information from the core; at most it will have the file descriptors somewhere, but this still doesn't give you the actual file/socket. (Assuming you can distinguish open from closed file descriptors, which I also doubt.)

mweerden 2008-09-12 09:53:50

True! I had forgotten that.

Vinko Vrsalovic 2008-09-12 09:55:33

Answer 3

A:

Another way to find out what files a process has opened - again, only at runtime - is to look into /proc/PID/fd/, which contains symlinks to the open files.
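
As a rough illustration of reading that directory from C, here is a minimal sketch. The function name list_open_fds and the buffer sizes are assumptions made for this example; only the /proc/PID/fd layout comes from the answer above. Note that the descriptor opened by opendir() itself also shows up in the listing when you inspect your own process.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: print "fd -> target" for every descriptor of `pid`.
 * Returns the number of descriptors listed, or -1 if /proc/<pid>/fd
 * cannot be opened (no such process, or insufficient permissions). */
int list_open_fds(pid_t pid, FILE *out)
{
    char dirpath[64];
    snprintf(dirpath, sizeof dirpath, "/proc/%ld/fd", (long)pid);

    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;                    /* skip "." and ".." */

        char linkpath[512];
        char target[4096];               /* assumed large enough for a path */
        snprintf(linkpath, sizeof linkpath, "%s/%s", dirpath, entry->d_name);

        /* Each entry is a symlink whose target names the open file. */
        ssize_t len = readlink(linkpath, target, sizeof target - 1);
        if (len < 0)
            continue;                    /* descriptor was closed meanwhile */
        target[len] = '\0';

        fprintf(out, "%s -> %s\n", entry->d_name, target);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling list_open_fds(getpid(), stdout) prints the current process's descriptors; passing another PID works only if you are allowed to read that process's /proc entry.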

skolima 2008-09-12 09:56:03

Answer 4

+1 A:

mweerden: the process is already gone, and to run strace I'd have to run it again and be able to reproduce the problem. strace is usually good, but in this case doesn't help :-(

Vinko: can you elaborate on the "memory the process had access to when crashed" part? :-) I'd have thought that the memory contains references to all open file descriptors, and that it's "just" difficult to find those references in the core dump?

oliver 2008-09-12 10:06:41

As I just added to my answer, I don't think you're going to get anything from the core dump. So there is not much more to do than rerun the program with strace and "hope" it crashes again.

mweerden 2008-09-12 10:30:12

Answer 5

A:

If the program forgot to close those resources it might be because something like the following happened:

fd = open("/tmp/foo", O_CREAT | O_WRONLY, 0644);
// do stuff
fd = open("/tmp/bar", O_CREAT | O_WRONLY, 0644); // Oops, forgot to close(fd) first

Now I no longer have the file descriptor for foo in memory.

If this didn't happen, you might be able to find the file descriptor number, but that is not very useful either: descriptor numbers are continuously reused, so by the time you get to debug you won't know which file a number referred to at the time of the crash.

I really think you should debug this live, with strace, lsof and friends.

If there is a way to do it from the core dump, I'm eager to know it too :-)

Vinko Vrsalovic 2008-09-12 10:15:05

Answer 6

A:

Your best bet is to install a signal handler for whatever signal is crashing your program (SIGSEGV, etc.).

Then, in the signal handler, inspect /proc/self/fd, and save the contents to a file. Here is a sample of what you might see:

Anderson cxc # ls -l  /proc/8247/fd
total 0
lrwx------ 1 root root 64 Sep 12 06:05 0 -> /dev/pts/0
lrwx------ 1 root root 64 Sep 12 06:05 1 -> /dev/pts/0
lrwx------ 1 root root 64 Sep 12 06:05 10 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Sep 12 06:05 11 -> socket:[124061]
lrwx------ 1 root root 64 Sep 12 06:05 12 -> socket:[124063]
lrwx------ 1 root root 64 Sep 12 06:05 13 -> socket:[124064]
lrwx------ 1 root root 64 Sep 12 06:05 14 -> /dev/driver0
lr-x------ 1 root root 64 Sep 12 06:05 16 -> /temp/app/whatever.tar.gz
lr-x------ 1 root root 64 Sep 12 06:05 17 -> /dev/urandom

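Then you can return from your signal handler, and you should get a core dump as usual, provided the handler does not stay installed (for example, install it with SA_RESETHAND so the default action is restored). Otherwise the faulting instruction simply triggers the handler again.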
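
A minimal sketch of such a handler follows. It is one possible way to do it, not a definitive implementation. The names (dump_open_fds, install_crash_handler), the report path /tmp/crash_fds.txt and the scan limit of 256 descriptors are all assumptions made for this example. Instead of listing the directory, it probes /proc/self/fd/N with readlink(), because opendir() and snprintf() are not async-signal-safe. It also re-raises the signal rather than just returning, so it behaves the same whether the signal came from a real fault or from kill.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FD_TO_SCAN 256                   /* assumption: higher fds are ignored */
#define FD_REPORT_PATH "/tmp/crash_fds.txt"  /* hypothetical report location */

static const char FD_DIR[] = "/proc/self/fd/";
#define FD_DIR_LEN (sizeof FD_DIR - 1)

/* Build "/proc/self/fd/<fd>" by hand; returns the string length. */
static size_t fd_link_path(char *buf, int fd)
{
    size_t len = 0;
    for (; len < FD_DIR_LEN; len++)
        buf[len] = FD_DIR[len];

    char digits[12];
    int n = 0;
    do {                                     /* digits come out in reverse */
        digits[n++] = (char)('0' + fd % 10);
        fd /= 10;
    } while (fd > 0);
    while (n > 0)
        buf[len++] = digits[--n];
    buf[len] = '\0';
    return len;
}

/* write() wrapper that retries short writes and gives up on error. */
static void put(int fd, const char *s, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, s, len);
        if (n <= 0)
            return;
        s += n;
        len -= (size_t)n;
    }
}

/* Write "N -> target" for each open descriptor below MAX_FD_TO_SCAN.
 * Returns the number of descriptors found (out_fd itself included). */
int dump_open_fds(int out_fd)
{
    int count = 0;
    for (int fd = 0; fd < MAX_FD_TO_SCAN; fd++) {
        char path[40];
        char target[1024];
        size_t path_len = fd_link_path(path, fd);
        ssize_t target_len = readlink(path, target, sizeof target);
        if (target_len < 0)
            continue;                        /* this fd is not open */
        put(out_fd, path + FD_DIR_LEN, path_len - FD_DIR_LEN);
        put(out_fd, " -> ", 4);
        put(out_fd, target, (size_t)target_len);
        put(out_fd, "\n", 1);
        count++;
    }
    return count;
}

static void crash_handler(int signo)
{
    int out = open(FD_REPORT_PATH, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out >= 0) {
        dump_open_fds(out);
        close(out);
    }
    /* SA_RESETHAND restored the default action on entry, so re-raising
     * terminates the process and produces the usual core dump. */
    raise(signo);
}

/* Install the handler for SIGSEGV; returns 0 on success, -1 on error. */
int install_crash_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = crash_handler;
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Call install_crash_handler() early in main(). If the crash is a stack overflow, the handler additionally needs an alternate stack (sigaltstack with SA_ONSTACK), which this sketch leaves out.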

Martin Del Vecchio 2008-09-12 13:55:25

Answer 7

+2 A:

If you have a core file and you have compiled the program with debugging options (-g), you can see where the core was dumped:

$ gcc -g -o something something.c
$ ./something
Segmentation fault (core dumped)
$ gdb something core

You can use this to do some post-mortem debugging. A few gdb commands: bt prints the stack, fr jumps to a given stack frame (see the output of bt).

Now if you want to see which files are open at the time of a segmentation fault, handle the SIGSEGV signal and, in the handler, dump the contents of the /proc/PID/fd directory (e.g. with system("ls -l /proc/PID/fd") or execv).

With this information at hand you can easily find what caused the crash, which files are open, and whether the crash and the file descriptor leak are connected.
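
For illustration, the system() variant could look roughly like the sketch below. Both function names are made up for this example. Keep in mind that system() is not async-signal-safe, so calling it from a SIGSEGV handler is a best-effort debugging aid rather than something guaranteed to work.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: format the shell command that lists the
 * descriptors of process `pid`. Returns snprintf()'s result. */
int format_fd_listing_command(char *buf, size_t size, long pid)
{
    return snprintf(buf, size, "ls -l /proc/%ld/fd", pid);
}

/* Run the listing for the calling process; returns system()'s status. */
int list_own_fds_with_ls(void)
{
    char cmd[64];
    /* Use the numeric PID: inside the child, /proc/self would refer
     * to the ls process rather than to the crashing program. */
    format_fd_listing_command(cmd, sizeof cmd, (long)getpid());
    return system(cmd);
}
```

The output goes to the program's stdout; redirect it inside the command string if you want it saved to a file.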

terminus 2008-09-15 18:43:05
