Since I migrated a CruiseControl build server to a new machine, it sometimes hangs during the "modificationset" stage of the build cycle (it is configured to check for modifications every 15 minutes). CruiseControl itself stays responsive; only the build does not progress.

There is no significant load on the CPU when this happens, and I've seen it stay in this state for an hour or more, though it seems to break out of this state eventually. There doesn't seem to be a pattern to which projects it happens to. The hardware is brand new, and I have run a memtest with no problems.

This is the system configuration:

  • Ubuntu 9.04 server, amd64, fully upgraded
  • svn version 1.5.4 (r33841) - the most recent version apt-get will install
  • Sun JRE 64 bit build 1.6.0_16-b01 - again, most recent version
  • CruiseControl 2.7.3 (not the most recent)

This is what my modificationsets look like:

<modificationset quietperiod="10">
 <veto><!-- there are several of these -->
  <triggers>
   <svn LocalWorkingCopy="${checkout_dir}/base" />
  </triggers>
  <buildstatus logdir="${log_dir}/base" />
 </veto>
 <timebuild time="2330" />
 <svn LocalWorkingCopy="${checkout_dir}/${project.name}" />
</modificationset>

So what could be done here?

Edit: Here's an excerpt from the CruiseControl log file, showing projectA hanging at 16:07 (it is still hanging now at 17:48):

2009-10-27 16:07:55,096 [Thread-38860] INFO  Project          - Project projectA:  bootstrapping
2009-10-27 16:07:55,096 [Thread-38860] INFO  ProjectController - projectA Controller: build progress event: bootstrapping
2009-10-27 16:07:55,262 [Thread-38862] INFO  ScriptRunner     - Buildfile: work/build-cruisecontrol.xml
2009-10-27 16:07:59,230 [Thread-38860] INFO  AntBootstrapper  - Bootstrap successful.
2009-10-27 16:07:59,230 [Thread-38860] INFO  Project          - Project projectA:  checking for modifications
2009-10-27 16:07:59,230 [Thread-38860] INFO  ProjectController - projectA Controller: build progress event: checking for modifications
2009-10-27 16:11:14,954 [Project projectB thread] INFO  Project          - Project projectB:  in build queue
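A hang like this can also be spotted from the log itself rather than by watching the dashboard. The following is only a rough sketch, assuming the log lines keep the format shown in the excerpt above; the function name `stalled_projects` and the 30-minute threshold are made up for illustration and are not part of CruiseControl.

```python
import re
from datetime import datetime, timedelta

# Matches "Project" logger lines as they appear in the excerpt above, e.g.
# 2009-10-27 16:07:59,230 [Thread-38860] INFO  Project  - Project projectA:  checking for modifications
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} \[[^\]]+\] \w+\s+Project\s+- "
    r"Project (\S+):\s+(.*)$"
)


def stalled_projects(lines, now, threshold=timedelta(minutes=30)):
    """Return {project: time_stuck} for every project whose most recent
    logged state is 'checking for modifications' and is older than
    `threshold` (hypothetical helper, arbitrary default threshold)."""
    last_state = {}
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            # Later lines overwrite earlier ones, so only the last state survives.
            last_state[match.group(2)] = (stamp, match.group(3).strip())
    return {
        name: now - stamp
        for name, (stamp, state) in last_state.items()
        if state == "checking for modifications" and now - stamp > threshold
    }
```

Feeding it the lines of the log file and `datetime.now()` from a cron job would give an alert as soon as a project has been checking for modifications for longer than the threshold.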
+2  A: 

Have you tried issuing the same SVN commands manually from the command line? Does it hang then?

Grzegorz Oledzki
It doesn't hang repeatably in CruiseControl either, just sometimes, and not always the same project.
Michael Borgwardt
Anyway, did you experience the same when running it manually? Even once? If so, then CruiseControl is not the issue here. Otherwise it's something about the SvnModificationSet (or whatever the class is called).
Grzegorz Oledzki
@GrzegorzOledzki: +1 for the try-by-hand method. Good place to start troubleshooting.
Critical Skill
The problem is: since the problem happens sporadically, it tells me *nothing* when it does not occur via the command line - might just be chance. Also, I do not know what the exact command line equivalent would be. "svn log", I guess. I just tried that about 50 times, with no problems.
Michael Borgwardt
Looks like it is "svn log" but with some parameters. You can track the exact command being issued by capturing the following log line: http://cruisecontrol.svn.sourceforge.net/viewvc/cruisecontrol/trunk/cruisecontrol/main/src/net/sourceforge/cruisecontrol/sourcecontrols/SVN.java?view=markup#l_294 (LOG.error("Error executing svn log command " + command, e); in SVN.class)
Grzegorz Oledzki
The easiest approach for checking the actual command being run by CC is to peek at `/proc/$PID/cmdline`.
Grzegorz Oledzki
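For reference, the `/proc/$PID/cmdline` file separates arguments with NUL bytes, so printing it raw runs everything together. Here is a small sketch for reading it, assuming a Linux `/proc` filesystem and that the client binary is literally named `svn`; the helper names are invented for this example.

```python
from pathlib import Path


def split_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into a list of
    argument strings. Empty arguments are dropped for simplicity."""
    return [arg.decode("utf-8", "replace") for arg in raw.split(b"\0") if arg]


def svn_commands():
    """Yield (pid, argv) for each running process whose executable is named
    'svn' (assumption: that is how the client appears on this machine)."""
    for entry in Path("/proc").iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            argv = split_cmdline((entry / "cmdline").read_bytes())
        except OSError:
            continue  # process already exited or is not readable by this user
        if argv and Path(argv[0]).name == "svn":
            yield int(entry.name), argv
```

Running `for pid, argv in svn_commands(): print(pid, " ".join(argv))` while a project is stuck should show the exact command CruiseControl spawned, which can then be repeated by hand.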
+1  A: 

Just some pointers:

  1. Does it hang at a particular time of the day? Or is it really random? Are there any new backups in place that shut down the service for the backup?

  2. Have you compared the config.xml of the new cruise server with that of the old (assuming the cruise version is identical on both, do they have the exact same tasks or is there something that might be slowing down the modificationset task)?

  3. Do the old and the new machines sit on the same networks as your subversion repositories (or at least do they have similar response times in accessing all the project repositories)? Given that the cruise itself remains responsive, is it possible that the particular project repo it is accessing at the time of the near-hang is too big, too slow or has too much going on in the repository?

These are just troubleshooting pointers, so they are by no means actual answers to your question. This is perhaps how I'd approach the problem (besides running the commands manually as in GrzegorzOledzki's answer).

Critical Skill
It doesn't seem to occur at specific times. The cruise control version and configuration are exactly identical, and the new machine is on the same network as the old one, but not the same as the repository (which is managed centrally, different department).
Michael Borgwardt
Update: I have now successfully run a command line svn query against a project while it's hanging in CC - so it's definitely not a recurring general problem on the server
Michael Borgwardt
Perhaps you need to run it against the project dependency rather than the project workspace itself (since you make use of the <veto> plugin). Since <veto> would instantly abort or continue with the build once the feedback from the triggers was available one way or the other, perhaps it is reasonable to suspect that it is waiting for the feedback from the project dependency?
Critical Skill
+1  A: 

Another idea. You could always start the CruiseControl JVM in debug mode, and whenever it hangs, connect to it using some IDE, e.g. Eclipse. Then you could see all the threads of the CC application, pause some of them and see what they are busy with.

Grzegorz Oledzki
Good idea, will definitely try that
Michael Borgwardt