We just acquired three new build slaves for Hudson, which are running Windows XP x64. We're having issues deploying to these that we haven't seen before (we have two other XP32 machines already slaved).

When we first reboot the server, or just after restarting the Server service, the node's log in Hudson shows the following (domain name changed to protect the innocent):

Connecting to beast.example.com
Copying slave.jar
The parameter is incorrect.
jcifs.smb.SmbException: The parameter is incorrect.
at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:542)
at jcifs.smb.SmbTransport.send(SmbTransport.java:644)
at jcifs.smb.SmbSession.sessionSetup(SmbSession.java:371)
at jcifs.smb.SmbSession.send(SmbSession.java:235)
at jcifs.smb.SmbTree.treeConnect(SmbTree.java:161)
at jcifs.smb.SmbFile.doConnect(SmbFile.java:858)
at jcifs.smb.SmbFile.connect(SmbFile.java:901)
at jcifs.smb.SmbFile.connect0(SmbFile.java:827)
at jcifs.smb.SmbFile.open0(SmbFile.java:917)
at jcifs.smb.SmbFile.open(SmbFile.java:951)
at jcifs.smb.SmbFileOutputStream.<init>(SmbFileOutputStream.java:142)
at jcifs.smb.SmbFileOutputStream.<init>(SmbFileOutputStream.java:97)
at jcifs.smb.SmbFileOutputStream.<init>(SmbFileOutputStream.java:67)
at jcifs.smb.SmbFile.getOutputStream(SmbFile.java:2793)
at hudson.os.windows.ManagedWindowsServiceLauncher.copySlaveJar(ManagedWindowsServiceLauncher.java:198)
at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:152)
at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:175)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
at java.util.concurrent.FutureTask.run(FutureTask.java:123)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)
at java.lang.Thread.run(Thread.java:613)

On any subsequent attempts to "Launch slave service", we get:

Connecting to beast.example.com
Copying slave.jar
0xC0000205
jcifs.smb.SmbException: 0xC0000205
at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:542)
at jcifs.smb.SmbTransport.send(SmbTransport.java:644)
at jcifs.smb.SmbSession.send(SmbSession.java:242)
at jcifs.smb.SmbTree.send(SmbTree.java:111)
at jcifs.smb.SmbFile.send(SmbFile.java:729)
at jcifs.smb.SmbFile.open0(SmbFile.java:934)
at jcifs.smb.SmbFile.open(SmbFile.java:951)
at jcifs.smb.SmbFileOutputStream.<init>(SmbFileOutputStream.java:142)
at jcifs.smb.SmbFileOutputStream.<init>(SmbFileOutputStream.java:97)
at jcifs.smb.SmbFileOutputStream.<init>(SmbFileOutputStream.java:67)
at jcifs.smb.SmbFile.getOutputStream(SmbFile.java:2793)
at hudson.os.windows.ManagedWindowsServiceLauncher.copySlaveJar(ManagedWindowsServiceLauncher.java:198)
at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:152)
at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:175)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
at java.util.concurrent.FutureTask.run(FutureTask.java:123)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)
at java.lang.Thread.run(Thread.java:613)

It seems like the SMB file sharing layer itself, not Hudson, may be the issue. We've double-checked group memberships and directory permissions for C:\hudson and they're identical to the other two slaves.

Using smbclient from the Mac OS X server that actually runs Tomcat+Hudson (but does not execute builds), I was able to get a strange response on one attempt:

smb: \hudson\> get hudson-slave.exe
NT_STATUS_INSUFF_SERVER_RESOURCES opening remote file \hudson\hudson-slave.exe
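
As far as I can tell, the 0xC0000205 in the second Hudson trace and the NT_STATUS_INSUFF_SERVER_RESOURCES from smbclient are the same NT status code, so both clients are probably hitting the same server-side condition. Here is a minimal Java sketch for decoding such codes. The class name NtStatusNames and its tiny lookup table are made up for illustration and are not part of jcifs, and the mapping of "The parameter is incorrect" to 0xC000000D is my assumption:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of jcifs: names a few NT status codes.
public class NtStatusNames {
    private static final Map<Integer, String> NAMES = new LinkedHashMap<>();
    static {
        // Assumption: this is the code jcifs reports as "The parameter is incorrect."
        NAMES.put(0xC000000D, "STATUS_INVALID_PARAMETER");
        // The raw code jcifs printed in the second trace.
        NAMES.put(0xC0000205, "STATUS_INSUFF_SERVER_RESOURCES");
    }

    static String describe(int status) {
        String name = NAMES.getOrDefault(status, "unknown");
        // The top two bits of an NTSTATUS are the severity; 3 means error.
        int severity = status >>> 30;
        return String.format("0x%08X %s (severity %d)", status, name, severity);
    }

    public static void main(String[] args) {
        // prints 0xC0000205 STATUS_INSUFF_SERVER_RESOURCES (severity 3)
        System.out.println(describe(0xC0000205));
        System.out.println(describe(0xC000000D));
    }
}
```

If that reading is right, jcifs simply has no message text for 0xC0000205 and prints the raw number instead.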

Googling around suggests an IRPStackSize issue might be the culprit, but jacking that up 5 at a time (eventually to 50 = 0x32) and restarting the Server service doesn't seem to help.
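
For reference, this is roughly what the IRPStackSize stepping looks like as commands. It is a sketch only: the class and method names are hypothetical, the starting value of 15 assumes the Windows default, and the 11 to 50 range is what I understand the documented limits to be. It only prints the commands and does not run them:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: prints the registry and service commands for each step.
public class IrpStackSizeSteps {
    // Registry key that holds the Server service parameters.
    static final String KEY =
            "HKLM\\SYSTEM\\CurrentControlSet\\Services\\LanmanServer\\Parameters";
    // Assumed valid range for IRPStackSize.
    static final int MIN = 11;
    static final int MAX = 50;

    // Builds the "reg add" argument list that sets IRPStackSize to the given value.
    static List<String> regAddCommand(int value) {
        if (value < MIN || value > MAX) {
            throw new IllegalArgumentException("IRPStackSize out of range: " + value);
        }
        return Arrays.asList("reg", "add", KEY, "/v", "IRPStackSize",
                "/t", "REG_DWORD", "/d", Integer.toString(value), "/f");
    }

    // Values to try after the starting value, capped at MAX.
    static List<Integer> steps(int start, int increment) {
        List<Integer> values = new ArrayList<>();
        for (int v = start + increment; v <= MAX; v += increment) {
            values.add(v);
        }
        return values;
    }

    public static void main(String[] args) {
        for (int value : steps(15, 5)) {
            System.out.println(String.join(" ", regAddCommand(value)));
            // The Server service has to be restarted for the change to take effect.
            System.out.println("net stop server /y && net start server");
        }
    }
}
```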

As an aside, launching JNLP client works just fine, although we'd prefer to have it as a service.


Hudson version is 1.323, by the way (only one behind, nothing in the changelog looks particularly relevant).

A: 

Looks like JCIFS may have a fix for this. From a coworker:

"jcifs-1.3.10 released / Bugfix for SmbException: The parameter is incorrect
posted by Mike, June 4, 2009
This release fixes a bug that could sporadically trigger a "The parameter is incorrect" error." 

"Just looked at the current hudson source, they're using jcifs-1.3.3 so they are behind and do not have this (as well as several other) update(s)."

I'll see about pushing this into the upstream bug tracker, and perhaps take a shot at integrating the newer version and rebuilding locally.


Update 1: filed an issue tracker entry here


Update 2: we've switched over to JNLP and used that to install a service, which is set to automatically start. This has been working without offline issues for a day or two now. Will keep watching the upstream bug to see if/when any activity happens there.

leander