I have access to a remote Solaris machine that crashes occasionally. Each time, I have to ask someone with physical access to boot it back up, which it does successfully. I would like to know which tools and files I should look at to find the cause of the crash, so that I can make the necessary configuration changes and avoid it in the future.

+3  A: 

Which tools you can use depends on which version of Solaris you are running and what the actual problem is. The first things to check are the system console (which it sounds like you don't have access to) and the /var/adm/messages file. System messages are appended to this file, so the newest entries appear at the end.
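As a rough sketch of that first check, the helper below lists the lines in a messages log that mention a panic, with their line numbers. The function name find_panics is made up for illustration, and matching on the word "panic" is an assumption about what the crash will have logged; your crash may leave different traces (or none, if the machine lost power).

```shell
# Hypothetical helper: print panic-related lines from a messages log.
# Defaults to /var/adm/messages; pass another path as $1 to check
# a different log file (for example a rotated copy).
find_panics() {
    log="${1:-/var/adm/messages}"
    # -i: ignore case, -n: prefix each match with its line number
    grep -i -n 'panic' "$log"
}
```

Running find_panics with no arguments searches the live log; the line numbers make it easy to open the file and read the entries just before the crash.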

Next, look for a system crash dump (core file). If one was saved, it is normally in /var/crash/hostname, where "hostname" is the name of the machine. Running dumpadm with no arguments shows the directory your system is actually configured to use.

If you have a core file in the /var/crash/hostname directory, this set of commands will give you a good string to search Google with:

# cd /var/crash/hostname

Replace "hostname" with the hostname of your machine.

# mdb -k unix.0 vmcore.0

If you have multiple core files, use the most recent pair of unix.N and vmcore.N files.

 > ::status

This should print the panic message. Copy and paste it into Google and see what you can find.
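If several dumps have accumulated, picking the right pair by hand gets tedious. The sketch below prints the highest numeric suffix among the vmcore.N files in a crash directory. It assumes that the highest number is the most recent dump, which is worth confirming against the file timestamps; the function name latest_dump is hypothetical.

```shell
# Hypothetical helper: print the highest N for which vmcore.N exists.
# Defaults to /var/crash/<hostname>; pass another directory as $1.
# Assumption: the highest suffix is the most recent dump.
latest_dump() {
    dir="${1:-/var/crash/$(hostname)}"
    # Keep only names of the form vmcore.<digits>, strip the prefix,
    # sort numerically, and take the largest value.
    ls "$dir" | sed -n 's/^vmcore\.\([0-9][0-9]*\)$/\1/p' | sort -n | tail -1
}

# Possible usage (run as root on the Solaris machine, inside the crash directory):
#   n=$(latest_dump .)
#   [ -n "$n" ] && echo '::status' | mdb -k unix.$n vmcore.$n
```

Piping ::status into mdb runs the same command as the interactive session above and exits, which is convenient when you only want the panic string.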

For more core file analysis read this: http://cuddletech.com/blog/pivot/entry.php?id=965

Alan H