We have a number of cloud servers and I am building a new one to test provision of resources on.

I am getting a fatal crash much earlier than I expected; some system resource appears to be running out.

=INFO REPORT==== 14-Feb-2010::12:40:14 ===
Setting up: "http://sub48.localhost:9000" as pirate
Mnesia('[email protected]'): Data may be missing, 
Corrupt logfile deleted: "(...)/sub48.localhost&9000&styles.DCL", {file_error,
"(...)/sub48.localhost&9000&styles.DCL", system_limit} 


=ERROR REPORT==== 14-Feb-2010::12:40:18 ===
Mnesia('[email protected]'): ** ERROR ** (could not write core file: system_limit)
 ** FATAL ** Cannot open log file "(...)/sub48.localhost&9000&styles.DCL": 
{file_error, "(...)/sub48.localhost&9000&styles.DCL", system_limit}

The operating system is Ubuntu 8.04 (LTS), but our other servers run Ubuntu 9.04 and Ubuntu 9.10. I think we will have to standardise them :(

So my questions are:

  • how can I identify what resource is running out?
  • what proactive monitoring steps can I take to ensure that it doesn't happen again?
  • which system resources, in general, might I be able to exhaust with an Erlang VM, and what monitoring steps should I have in place for them?

A:

There is an Erlang/OTP application called os_mon which lets you monitor various resources such as CPU load, memory and disk usage. Also check out the sasl OTP application, especially overload and alarm_handler.
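
As a rough illustration of the kind of check this makes possible, here is a minimal sketch that takes a one-off snapshot of some resources. The module name `resource_snapshot` and the shape of the returned list are my own invention, not part of OTP. It assumes a reasonably recent OTP release (`port_count`, `port_limit` and `ets_limit` are not available from `erlang:system_info/1` on old releases). A `system_limit` error when opening a file may point at ports or file descriptors running out, but that is a guess; the snapshot only shows how close the VM is to some of its own limits, next to what os_mon reports about the host.

```erlang
-module(resource_snapshot).
-export([snapshot/0]).

%% Returns a list of {Name, Reading} tuples covering a few VM-internal
%% limits and the OS-level figures collected by os_mon.
snapshot() ->
    %% os_mon depends on sasl; ensure_all_started/1 starts both if needed.
    {ok, _} = application:ensure_all_started(os_mon),
    [%% VM-internal resources, reported as {InUse, Limit}
     {processes,  {erlang:system_info(process_count),
                   erlang:system_info(process_limit)}},
     {ports,      {erlang:system_info(port_count),
                   erlang:system_info(port_limit)}},
     {ets_tables, {length(ets:all()),
                   erlang:system_info(ets_limit)}},
     %% OS-level readings from os_mon
     {load_avg1,     cpu_sup:avg1()},
     {system_memory, memsup:get_system_memory_data()},
     {disks,         disksup:get_disk_data()},
     %% Alarms currently registered with the SASL alarm handler
     {alarms,        alarm_handler:get_alarms()}].
```

Calling `resource_snapshot:snapshot().` from the shell periodically (or from a timer) and logging the result would show which figure climbs before the crash. Note that os_mon does not report the number of open file descriptors; the port count is only a related indicator, and the per-process OS limit (`ulimit -n`) has to be checked outside the VM.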

snies
Duh, why didn't I think of looking for that sort of stuff there...
Gordon Guthrie