ansaurus

Question

Answer 1

+2 A:

It looks like a memory problem. I'd get a dump of memory usage and see how it behaves over time. If the GC times keep increasing, you have your culprit. You could then just increase the memory available to the JVM to get it going again.

Anyway, don't convert batches into a List; it is unnecessary. It would be necessary if you were using for/yield (on Scala 2.7), but since you are not yielding anything, a Range is the better choice.
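As a minimal sketch of looping over a Range directly when nothing is yielded: `BatchLoop`, `processAll` and the `processBatch` callback are hypothetical names for illustration, since the question's own code is not shown here.

```scala
object BatchLoop {
  /** Calls processBatch once per batch offset and returns the number of batches.
    * The offsets are a Range, so no List is built just to drive the loop.
    * (Hypothetical helper; not the original poster's code.)
    */
  def processAll(totalRecords: Int, batchSize: Int)(processBatch: Int => Unit): Int = {
    val offsets = 0 until totalRecords by batchSize
    // No yield: the loop runs purely for its side effects.
    for (offset <- offsets) processBatch(offset)
    offsets.size
  }
}
```

For 1000 records in batches of 250, the callback is invoked with offsets 0, 250, 500 and 750.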

Daniel 2009-12-07 20:23:29

Sorry, I should have mentioned that it does yield: it keeps a list of the Futures, then calls awaitAll() to wait for them to finish before moving on to the next section. Memory usage could be the issue, but I'm not sure why it wouldn't be releasing the memory, as I can't spot anything leaky. I'm allocating 800M as it currently stands.

Wysawyg 2009-12-07 20:36:13

@Wysawyg - have you used `jconsole` (`$JAVA_HOME/bin/jconsole`) to attach to the application? It is very good for telling you a few things: 1. Is the app spending all of its time doing GC? 2. What are the threads doing?

oxbow_lakes 2009-12-07 21:48:31

Answer 2

+2 A:

The jconsole application, which comes bundled with the JDK (in $JAVA_HOME/bin/jconsole) can be used to attach to the application as it runs. This is very good for telling you a few things:

Is the app spending all of its time doing GC?
What are the application threads doing?

Could you post the results here?

oxbow_lakes 2009-12-07 21:50:01

Hey, thanks for the suggestion. I'm running jconsole but nothing stands out as bad. GC has so far spent 2 minutes 30 seconds out of 1 hour 8 minutes of runtime. I can't see anything off about what the threads are doing either. I suppose I need to profile it while it's speeding along, then again once it has started crawling, and play spot the difference. Thanks for the advice.

Wysawyg 2009-12-07 22:15:06

The normal thing to spot, if the issue is GC, is that the GC time starts to go up more or less in real time, i.e. it might be 2 mins out of 60 now, and 4 out of 62 in a few minutes' time. That would mean the last 2 mins were spent entirely in GC.
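The same check can be roughly sketched in code using the standard `java.lang.management` beans: sample the cumulative GC time twice and compare the difference with the wall-clock interval. `GcWatch` and its method names are hypothetical, chosen for illustration only.

```scala
import java.lang.management.ManagementFactory

object GcWatch {
  /** Cumulative milliseconds the JVM reports as spent in GC, summed over all collectors. */
  def totalGcMillis(): Long = {
    val it = ManagementFactory.getGarbageCollectorMXBeans().iterator()
    var total = 0L
    while (it.hasNext) {
      val t = it.next().getCollectionTime() // -1 when a collector does not report a time
      if (t > 0) total += t
    }
    total
  }

  /** Fraction of a wall-clock interval spent in GC, given two cumulative samples. */
  def gcFraction(gcBeforeMs: Long, gcAfterMs: Long, wallMs: Long): Double =
    if (wallMs <= 0) 0.0 else (gcAfterMs - gcBeforeMs).toDouble / wallMs
}
```

With the numbers above (2 minutes of GC at the 60-minute mark, 4 minutes at the 62-minute mark), `GcWatch.gcFraction(120000L, 240000L, 120000L)` gives 1.0, i.e. the whole interval went to GC.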

oxbow_lakes 2009-12-07 22:50:12

As an example, I have an app of mine that has spent less than 3 seconds in GC out of 24 hours! Another one (with >20k actors) takes over 10% of its time in GC!

oxbow_lakes 2009-12-07 22:52:17

One thing I have noticed is that the thread responsible for managing the other threads is spending most of its time waiting, more so than at the start. All the other 8 threads are marked as Runnable, with about 1k time in blocked and none in waiting, so it seems those threads are ready to do the work but somehow the work isn't being assigned to them. Does that sound at all plausible? I'm now trying it with the full record count split into 8 batches, firing off an actor for each, with each actor working through its share in batches of 250. That way I can see whether the problem lies in my use of actors or in my other code.
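For comparing thread states between the fast and slow phases without watching jconsole, a small sketch like the following could log a count per state from inside the application. `ThreadStates` and `countByState` are hypothetical names; only standard `java.lang.Thread` calls are used.

```scala
object ThreadStates {
  /** Counts the live threads in this JVM by their current Thread.State. */
  def countByState(): Map[Thread.State, Int] = {
    val it = Thread.getAllStackTraces().keySet().iterator()
    var counts = Map.empty[Thread.State, Int]
    while (it.hasNext) {
      val state = it.next().getState()
      counts = counts.updated(state, counts.getOrElse(state, 0) + 1)
    }
    counts
  }
}
```

Logging the result once early on and again after the slowdown would show whether the workers have drifted from RUNNABLE to WAITING or BLOCKED.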

Wysawyg 2009-12-08 09:51:07

OK - so this seems like a difficult one. As I said, I have an app running over 20k actors processing market data across all regions in real time, so this is probably not a bug in actors. If I were you, I'd start looking for things like comparisons (possibly hash collisions?) being made by Hibernate, or some operation that walks over all the results on every batch invocation.

oxbow_lakes 2009-12-08 10:50:29

Thanks for all your help. I'd upvote your answer if I had enough reputation. Hopefully I can get this solved before I lose the shreds of my sanity.

Wysawyg 2009-12-08 22:30:04

Answer 3

+2 A:

Try bounding the maximum number of threads that the actor library will create (futures are backed by actors). The actor threads are EXTREMELY heavyweight, and under certain conditions the scheduler will create them like there's no tomorrow. This uses up a ton of heap space and can make your program spend huge amounts of time performing garbage collection.

This can be done by setting the actors.maxPoolSize parameter on the command line, which would be something like -Dactors.maxPoolSize=32, or whatever maximum number of threads you want.

I also highly recommend running your program with -Xprof to see how much time the GC is consuming.
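To illustrate the general idea of a bounded worker pool, here is a sketch that uses a plain `java.util.concurrent` fixed thread pool rather than the actor library's own scheduler. It is a stand-in for the principle, not what the actor library does internally; `BoundedPool`, `runBatches` and `work` are hypothetical names.

```scala
import java.util.concurrent.{Callable, Executors}

object BoundedPool {
  /** Runs one task per batch on at most maxThreads worker threads and sums the results. */
  def runBatches(batchCount: Int, maxThreads: Int)(work: Int => Int): Int = {
    // The pool never grows beyond maxThreads, however many tasks are queued.
    val pool = Executors.newFixedThreadPool(maxThreads)
    try {
      // Submit every batch first, then block on each result in turn.
      val futures = (0 until batchCount).map { i =>
        pool.submit(new Callable[Int] { def call(): Int = work(i) })
      }
      futures.map(_.get()).sum
    } finally {
      pool.shutdown()
    }
  }
}
```

For example, `BoundedPool.runBatches(8, 2)(i => i * 10)` processes eight batches on two threads; extra tasks wait in the queue instead of causing more threads to be created.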

Erik Engbrecht 2009-12-08 23:23:46


Scala Concurrency slowing down
