
I'm developing a tool that takes Java code and generates code for estimating the execution time of basic blocks, loops, and methods. After a block is executed, we record its time in our tool. The program model is stored in the following representation:

static Hashtable<String, Hashtable<Integer, Hashtable<String, pair>>> method2Data = new Hashtable<String, Hashtable<Integer, Hashtable<String, pair>>>();

static Hashtable<String, Vector<String>> class2Method = new Hashtable<String, Vector<String>>();

And the time is recorded through the method

 public static void addBlock(int id, double time, String meth, String thread);

But I have the following problem. On each call of addBlock, we read something from method2Data. Since we could have code like

for (int i = 0; i < n; i++)
  for (int j = 0; j < n; j++)
    for (int k = 0; k < n; k++) {
      addBlock(0,...);
      addBlock(m,...);
    }

we call addBlock a lot of times. Unfortunately, after some constant amount of running time on our cluster, the program just stops working. It still shows up as a process, but doesn't use any CPU. I found that if I remove the code that reads from method2Data, then everything is fine. So I guess there is some problem with the accesses to the Hashtable. Does anybody have any good ideas?

Thanks to all. It seems that I have a deadlock in the case of concurrent accesses, and perhaps I could run out of memory when there is no concurrency involved.

+3  A: 

If you are using Java 5 or above, you shouldn't use Hashtable but ConcurrentHashMap, which provides much better scalability via lock striping and thus might solve the problem instantly (in case you have a deadlock or starvation issue, which is a possibility based on your - incomplete - description). Along the same lines, don't use Vector, but some other List implementation instead. Both Hashtable and Vector are old collection implementations that are synchronized, in Vector's case possibly unnecessarily.
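For illustration, a minimal sketch of what the switch might look like. The field name mirrors your question; the pair class is stubbed here since we can't see yours, and forMethod() is a made-up helper showing the putIfAbsent idiom for creating the nested maps atomically:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class pair { double time; int count; } // stub standing in for your pair class

class ProfileData {
    static final ConcurrentMap<String, ConcurrentMap<Integer, ConcurrentMap<String, pair>>> method2Data =
        new ConcurrentHashMap<String, ConcurrentMap<Integer, ConcurrentMap<String, pair>>>();

    // Atomic get-or-create of the per-method map. Note that a separate
    // get() followed by put() would NOT be atomic, even on a concurrent map.
    static ConcurrentMap<Integer, ConcurrentMap<String, pair>> forMethod(String meth) {
        ConcurrentMap<Integer, ConcurrentMap<String, pair>> m = method2Data.get(meth);
        if (m == null) {
            ConcurrentMap<Integer, ConcurrentMap<String, pair>> fresh =
                new ConcurrentHashMap<Integer, ConcurrentMap<String, pair>>();
            m = method2Data.putIfAbsent(meth, fresh);
            if (m == null) {
                m = fresh; // this thread's map won the race
            }
        }
        return m;
    }
}

The same putIfAbsent pattern applies at the inner nesting levels.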

[Update] As @finnw rightly pointed out, even if ConcurrentHashMap seems to help, you can't be sure that the root cause is really solved. Thorough analysis and testing is required to see whether or not your code is actually thread safe. Without seeing the code in addBlock(), we can't conclude anything on this. [/Update]

Also, as others noted, keeping the profile data in memory is not a good idea for large programs, as this may influence the performance characteristics you are trying to measure, and you may even run out of memory.

Péter Török
-1. This is dangerous because if there is a deadlock, it will occur less frequently with a `ConcurrentHashMap`, making the bug appear to be fixed when it is not.
finnw
@finnw, it may occur less frequently, or it may cease to occur completely - impossible to say without seeing the concrete code. You are right that this is a risk; I updated my answer to clarify.
Péter Török
@Péter Török Thank you. After all these thoughts, I guess I need to rewrite this part of my code and make it thread-safe for sure.
soh
+1  A: 

I don't quite understand how your home-built profiling mechanism works. But it looks like it's being done inside the program you're measuring, and it's using up a lot of memory inside that program.

Just as an aside, you're using Vectors in that Hashtable. Vectors are synchronized and therefore somewhat slower than, say, ArrayLists. You could probably squeeze a tiny bit of performance out of changing that.

Back to the main problem. Other profiling tools take a different approach: rather than building an in-memory data structure of results, they log everything to a file. Later, once the original program has run, another program reads, digests, and analyzes the log file. It turns out that writing to a buffered file is more consistent and less intrusive in terms of time/memory usage than grafting your instrumentation into your program.
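A minimal sketch of that idea, with all names made up for illustration: each addBlock event becomes one tab-separated line in a buffered log file, and aggregation happens offline after the run:

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

class BlockLog {
    private static PrintWriter out;

    static synchronized void open(String file) throws IOException {
        out = new PrintWriter(new BufferedWriter(new FileWriter(file)));
    }

    // Called from the instrumented code; just appends a line,
    // no shared in-memory data structure to contend on.
    static synchronized void addBlock(int id, double time, String meth, String thread) {
        out.println(id + "\t" + time + "\t" + meth + "\t" + thread);
    }

    static synchronized void close() {
        out.close(); // PrintWriter.close() flushes the underlying writer
    }
}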

Carl Smotricz
+3  A: 

Just a random thought: Did you maybe run out of memory?

Try to start your application with -Xmx512m to allow for 512 MB of heap space.

Running out of memory usually slows down your process until it appears to do nothing, because it is calling garbage collection after every other instruction.
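To see whether that is happening, you could also run the JVM with GC logging switched on, e.g. (the class name is just a placeholder):

java -Xmx512m -verbose:gc YourMainClass

If the output shows full collections firing back to back while reclaiming almost nothing, you are running out of memory.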

Stroboskop
Well, that was my first thought, but changing from -Xmx2000M to -Xmx2500M didn't change the constant amount of time before it stopped working. Also, I looked at the "top" command while running the program, and there was free memory.
soh
+3  A: 

Hashtables in Java can trigger nasty behavior in the GC, especially when they are long-lived and big. The same is true for HashMap. To figure out whether this is the case for you, check how much CPU the process needs. If it needs a lot of CPU, then the GC is running. If it doesn't need any CPU (it just hangs), then you have a deadlock.

To find the cause of the deadlock, create a thread dump and analyze it. If you're using Eclipse, you may want to look at Lockness.
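On Java 5+ you can take the dump with jstack <pid> (or kill -3 <pid> on Unix), and you can also trigger one from inside the process itself; a minimal sketch:

import java.util.Map;

class DumpThreads {
    // Prints every live thread's stack trace to stderr; call this from a
    // watchdog thread when the process appears to hang.
    static void dumpAll() {
        Map<Thread, StackTraceElement[]> dumps = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> e : dumps.entrySet()) {
            System.err.println(e.getKey());
            for (StackTraceElement frame : e.getValue()) {
                System.err.println("\tat " + frame);
            }
        }
    }
}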

Aaron Digulla
A: 

It looks pretty much like a deadlock. Try switching from Hashtable to ConcurrentHashMap, or investigate your application and add additional locking to resolve the deadlock.
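If you keep Hashtable, the important part is that the whole check-then-insert sequence runs under one shared lock. A minimal sketch of the kind of additional locking meant here (the field mirrors the question; pair is stubbed, and dataFor() is a hypothetical helper):

import java.util.Hashtable;

class LockedAccess {
    static class pair { double time; } // stub for the poster's pair class

    static final Hashtable<String, Hashtable<Integer, Hashtable<String, pair>>> method2Data =
        new Hashtable<String, Hashtable<Integer, Hashtable<String, pair>>>();

    // One lock around the whole compound action, so two threads can
    // never interleave between the get() and the put().
    static Hashtable<Integer, Hashtable<String, pair>> dataFor(String meth) {
        synchronized (method2Data) {
            Hashtable<Integer, Hashtable<String, pair>> m = method2Data.get(meth);
            if (m == null) {
                m = new Hashtable<Integer, Hashtable<String, pair>>();
                method2Data.put(meth, m);
            }
            return m;
        }
    }
}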

Vadim Fedorov