I'm working on a batch system (Torque). The important part is this: when an interactive job is run, the submit tool communicates with the execution host over a socket. This communication fails on one of the machines.
Here is the strace output:
Submit tool:
16:18:36.219925 fcntl(4, F_GETFL) = 0x2 (flags O_RDWR)
16:18:36.219925 read(4, "610.torque1.ics.muni.cz\0\0\0\0\0\0\0\0\0\0"..., 16385) = 1046
16:18:36.219925 write(4, "TERM=xterm\0\0\0\0\0\0\220\5\377\377\377\177\0\0\214\303u\310\277\177\0\0\26"..., 80) = 80
16:18:36.219925 write(4, "\3\34\177\25\4\32"..., 6) = 6
16:18:36.219925 write(4, "WINSIZE 46,166,0,0\0\0\0\0\0\0\1@\0\0\0\0\0\0\0"..., 80) = 80
16:18:36.219925 write(1, "qsub: job 610.torque1.ics.muni.cz"..., 41qsub: job 610.torque1.ics.muni.cz ready ) = 41
16:18:36.219925 rt_sigaction(SIGINT, {SIG_IGN}, NULL, 8) = 0
16:18:36.219925 rt_sigaction(SIGTERM, {SIG_IGN}, NULL, 8) = 0
16:18:36.219925 rt_sigaction(SIGALRM, {SIG_IGN}, NULL, 8) = 0
16:18:36.219925 rt_sigaction(SIGTSTP, {SIG_IGN}, NULL, 8) = 0
16:18:36.219925 clone(Process 20724 attached child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fbfc9a2d770) = 20724
Execution host part:
[pid 8778] 15:59:16.371145 getsockopt(3, SOL_SOCKET, SO_ERROR, [4294967296], [4]) = 0
[pid 8778] 15:59:16.371145 fcntl(3, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
[pid 8778] 15:59:16.371145 fcntl(3, F_SETFL, O_RDWR) = 0
[pid 8778] 15:59:16.371145 write(3, "609.torque1.ics.muni.cz\0\0\0\0\0\0\0\0\0\0"..., 1046) = 1046
[pid 8778] 15:59:16.371145 fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
[pid 8778] 15:59:16.375144 read(3, 0x717ae0, 80) = ? ERESTARTSYS (To be restarted)
[pid 8778] 15:59:21.367024 --- SIGALRM (Alarm clock) @ 0 (0) ---
[pid 8778] 15:59:21.367024 rt_sigreturn(0x8) = -1 EINTR (Interrupted system call)
[pid 8778] 15:59:21.367024 ioctl(2, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
[pid 8778] 15:59:21.367024 write(2, "pbs_mom: LOG_ERROR::Interrupted s"
Edit: These are actually separate runs, but the output is always exactly the same.
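For reference, the execution host trace looks like a blocking read guarded by an alarm timer: the `read(3, ..., 80)` starts at 15:59:16 and is interrupted by SIGALRM about five seconds later. Below is a minimal sketch of that pattern in isolation, not Torque's actual code. The names `read_with_alarm`, `ReadTimeout`, `mom_side` and `qsub_side` are hypothetical, and a local socket pair stands in for the real connection. Python retries interrupted system calls on its own, so the handler raises an exception to surface the timeout instead of relying on EINTR.

```python
import signal
import socket


class ReadTimeout(Exception):
    """Raised by the SIGALRM handler to abort a blocking read (hypothetical name)."""


def _on_alarm(signum, frame):
    # Python would otherwise restart the interrupted recv(), so raise instead.
    raise ReadTimeout("no data arrived before the alarm fired")


def read_with_alarm(sock, nbytes, timeout_s):
    """Blocking recv guarded by a timer; returns the data, or None on timeout."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout_s)
    try:
        return sock.recv(nbytes)
    except ReadTimeout:
        return None
    finally:
        # Cancel the timer and restore the old handler in every case.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


if __name__ == "__main__":
    # Assumption: a local socket pair stands in for the qsub <-> pbs_mom connection.
    mom_side, qsub_side = socket.socketpair()

    # The peer sends nothing, so the read is cut short by the alarm.
    print("first read:", read_with_alarm(mom_side, 80, 0.2))

    # The peer sends an 80-byte record, so the read completes normally.
    qsub_side.sendall(b"TERM=xterm".ljust(80, b"\0"))
    data = read_with_alarm(mom_side, 80, 0.2)
    print("second read:", len(data), "bytes")
```

The first call corresponds to what the execution host trace shows (no data within the timeout), the second to what I would expect if the 80-byte `TERM=...` record written by the submit tool actually reached the other side.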