My application is basically a content-based router that routes MMS events.

The logger I am using is error_logger, the one that comes with the OTP framework, running in SASL mode.

The issue is:

I am using a client to generate MMS events with default values. This client (written in Java) can send a high load of events from multiple threads.

I am sending 100 events from 10 threads (each thread sending 10 MMS events) to my router, which is written in Erlang/OTP.

The problem is that when my router receives such a high load, my logger hangs, i.e. it stops updating my log file. The router itself is still able to route the events.

The possible causes I have come up with are:

1) A scheduling problem in Erlang when such a high load of events is received (a separate Erlang process is spawned for each event).

2) A very unlikely deadlock state.

3) It might be due to sending events from multiple threads rather than sequentially. But I expect a router to be connected to multiple service provider boxes, so I thought of sending events from threads.

Can anybody help me demystify the problem?

+2  A: 

Which version of Erlang are you using? Prior to R14A (R13B4 maybe?), there was a performance penalty when you invoked a selective receive when the message queue contained a lot of messages. This behaviour meant that in a process that receives lots of messages (error_logger being the canonical example), if it was barely keeping up with the load then a small spike in load could cause the cost of processing to spike up and stay there as the new processing cost was higher than the process could bear. This problem has been solved in R14A.

Secondly - why are you sending a high volume of events/calls/logs to a text logger? Formatting strings for output to a human readable log file is a lot more expensive than using a binary disk_log for instance. Reducing the cost of logging will help, but reducing the volume of logs will help even more. Maybe investigate exactly why you need to log these things and see if you can't record them another (less expensive) way.
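
To illustrate the binary disk_log idea, here is a minimal sketch for the Erlang shell. The log name `mms_events`, the file name and the shape of the event tuple are assumptions made up for the example, not anything from the original application.

```erlang
%% Hypothetical log name, file name and event shape; adjust to your router.
%% Note: open/1 returns {repaired, ...} instead of {ok, Log} if an existing
%% file was not closed cleanly, in which case this match would fail.
{ok, Log} = disk_log:open([{name, mms_events},
                           {file, "mms_events.LOG"},
                           {type, halt},
                           {format, internal}]),

%% Terms are stored in binary form, so no string formatting is done here.
%% alog/2 is asynchronous; log/2 is the synchronous variant.
ok = disk_log:alog(Log, {routed, os:timestamp(), [{msg_id, 1}, {latency_ms, 12}]}),
ok = disk_log:sync(Log),

%% Read the terms back later, e.g. from a reporting job.
{_Cont, Terms} = disk_log:chunk(Log, start),
ok = disk_log:close(Log).
```

The formatting cost then moves to whatever reads the log, which can run offline instead of on the routing path.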

Problems with error_logger are often symptoms of some other overload problem. Try looking at the message queue sizes for all your processes when this problem occurs and see if something else is backed up too. The following Erlang shell code might help:

%% Skips processes that exit mid-scan (process_info/2 returns undefined for them).
[ {P, Len}
  || P <- erlang:processes(),
     {message_queue_len, Len} <- [erlang:process_info(P, message_queue_len)] ]
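
On a busy node the full list is long, so it may be easier to look only at the worst offenders and at error_logger itself. The following is a sketch for the shell; the cut-off of five entries is an arbitrary choice, and `whereis(error_logger)` can return `undefined` on releases where that process is not registered.

```erlang
%% Collect {Pid, QueueLen}; processes that exit mid-scan are skipped
%% because process_info/2 returns undefined for them.
Queues = [{P, Len} || P <- erlang:processes(),
                      {message_queue_len, Len} <-
                          [erlang:process_info(P, message_queue_len)]],

%% The five longest queues, largest first.
Top5 = lists:sublist(lists:reverse(lists:keysort(2, Queues)), 5),

%% Queue length of error_logger itself, if it is registered.
LoggerQueue = case whereis(error_logger) of
                  undefined -> not_registered;
                  Pid -> erlang:process_info(Pid, message_queue_len)
              end.
```

If `LoggerQueue` keeps growing while the other queues stay short, the logger is the bottleneck; if several queues grow together, the overload is probably elsewhere.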
archaelus
Thanks a lot, archaelus. I will investigate your suggestions further.
arn_ml
+1  A: 

You already have a good answer, but I'll add to the discussion.

The error_logger is by default using cached write operations to disk. So one possibility is that you don't really notice this while under low load, but under high load your writes get stuck in the cache for a while.

On a side note: there should be no problem having multiple threads doing calls to Erlang.

Another way of testing this is to add your own logger to error_logger, and see what happens. Possibly printing to the shell or something else that is "fast".
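
A custom handler is an ordinary gen_event callback module. The sketch below is one possible shape, not the only one: the module name `shell_logger` and the choice to print only an event counter and tag are assumptions for illustration.

```erlang
-module(shell_logger).   %% hypothetical module name
-behaviour(gen_event).

-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% State is simply the number of events seen so far.
init(_Args) -> {ok, 0}.

%% Print a short line per event (counter plus event tag) to the user device.
handle_event(Event, Count) when is_tuple(Event) ->
    io:format(user, "event ~p: ~p~n", [Count + 1, element(1, Event)]),
    {ok, Count + 1};
handle_event(_Event, Count) ->
    {ok, Count}.

%% Allow the current counter to be queried with gen_event:call/3.
handle_call(count, Count) -> {ok, Count, Count};
handle_call(_Request, Count) -> {ok, {error, unknown_request}, Count}.

handle_info(_Info, Count) -> {ok, Count}.

terminate(_Reason, _Count) -> ok.

code_change(_OldVsn, Count, _Extra) -> {ok, Count}.
```

After compiling, it can be attached with `error_logger:add_report_handler(shell_logger).` and removed again with `error_logger:delete_report_handler(shell_logger).` If this handler keeps up under load while the file log stalls, the file writing is the likely culprit.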

Daniel Luna
I want formatted output in the log because I parse it (not in real time) to display the number of successful events and their average response times. Wouldn't using io:format to print log messages to the shell slow down performance? Also, in my application I am running the Erlang virtual machine as a daemon.
arn_ml
And thanks for your answer, Daniel. :)
arn_ml
Output from io:format will of course not be visible if you run your Erlang VM as a daemon, unless you write to a file with io:format(Fd, Txt, Args). I would recommend using the error logger rather than io:format, though. It has a bunch of nice properties that you will want when your system grows a bit.
Daniel Luna
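
For reference, the io:format(Fd, Txt, Args) form mentioned above could look like the following in the shell. The file name and the message fields are assumptions made up for the example.

```erlang
%% "events.log" is a hypothetical file name; append mode keeps earlier lines.
{ok, Fd} = file:open("events.log", [append]),

%% One formatted line per event: an id string and a latency in milliseconds.
ok = io:format(Fd, "~s routed in ~B ms~n", ["mms-0001", 12]),

ok = file:close(Fd).
```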
A: 

Hi Archaelus, I tried running my application on the Erlang R14A 64-bit release, but I am still facing the same issue :( . What I am actually logging is the time and some extra details of the events that were successfully processed by my application, which comes to about three quarters of a line per event. I need these logs because an external script uses the information in them to display the number of successful events in a given time period.

So far my error_logger has only hung, never crashed. If it ever crashes, will it bring my whole application down?

arn_ml
please help. :(
arn_ml
Guys, I am using log4erl now, and it is giving much better performance than error_logger. :)
arn_ml