Hi Everyone,

Every morning I find that my build server (Hudson) has stopped, so I have to start it manually. Is there any reason why this happens, or anywhere I can start looking for an error message?

Thanks.

Here are the diagnostics I ran:

ascari:~# ps -ef | grep -i hud
root      5959  5944  0 09:00 pts/0    00:00:00 grep -i hud

ascari:~# cd /etc/init.d

ascari:/etc/init.d# ./hudson start

ascari:/etc/init.d# ps -ef | grep -i hud
hudson    6004     1  0 09:00 ?        00:00:00 /usr/bin/daemon --name=hudson --inherit --env=HUDSON_HOME=/var/lib/hudson --output=/var/log/hudson/hudson.log --user=hudson --pidfile=/var/run/hudson/hudson.pid -- /usr/bin/java -Xms512m -Xmx1024m -Dhttp.proxyHost=proxy.domain.com -Dhttp.proxyPort=3128 -Dhttp.nonProxyHosts="localhost|ascari|*.domain.com" -jar /usr/share/hudson/hudson.war --webroot=/var/run/hudson/war
hudson    6005  6004 48 09:00 ?        00:00:01 /usr/bin/java -Xms512m -Xmx1024m -Dhttp.proxyHost=proxy.domain.com -Dhttp.proxyPort=3128 -Dhttp.nonProxyHosts="localhost|ascari|*.domain.com" -jar /usr/share/hudson/hudson.war --webroot=/var/run/hudson/war
root      6008  5944 14 09:01 pts/0    00:00:00 grep -i hud

ascari:/etc/init.d# df -k -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             327M  125M  185M  41% /
tmpfs                 1.5G     0  1.5G   0% /lib/init/rw
udev                   10M   96K   10M   1% /dev
tmpfs                 1.5G     0  1.5G   0% /dev/shm
/dev/sda9             4.7G  295M  4.1G   7% /home
/dev/sda8             4.2G  155M  3.8G   4% /tmp
/dev/sda5             4.6G  3.0G  1.4G  69% /usr
/dev/sda6              65G   32G   30G  52% /var

ascari:/etc/init.d# uname -a
Linux ascari 2.6.26-2-686 #1 SMP Sun Jun 21 04:57:38 UTC 2009 i686 GNU/Linux

ascari:/etc/init.d#

+1  A: 

Have you checked the log file (referenced above) and set the --logfile argument (as documented here)?
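
In case it helps, here is a rough sketch of how the log path could be pulled out of the ps output in the question. It assumes the daemon wrapper is started with an --output=<path> argument, as in the listing above; the function name daemon_log_path is just made up for illustration.

```shell
# Print the log file path from a daemon command line read on stdin.
# Looks for the first --output=<path> argument (as seen in the ps listing).
# daemon_log_path is a hypothetical helper name, not part of Hudson.
daemon_log_path() {
    grep -o -e '--output=[^ ]*' | head -n 1 | cut -d= -f2-
}

# Usage (only works while Hudson is running, so ps shows the daemon line):
#   ps -ef | grep '[h]udson' | daemon_log_path
#   tail -n 100 "$(ps -ef | grep '[h]udson' | daemon_log_path)"
```

With the command line shown in the question, this should point at /var/log/hudson/hudson.log, which is the first place to look after a crash.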

Brian Agnew
Ah, where is the log file location you are referring to?
Albert Widjaja
Thanks man, it seems the error is logged in /var/log/messages, which shows: [<c015903e>] oom_kill_process+0x4f/0x195 [<c0159468>] out_of_memory+0x14e/0x17f [<c02b99ca>] error_code+0x72/0x78. It was caused by too many projects compiling at the same time, so rescheduling the projects to build 30 minutes apart did the trick and it is solved now.
Albert Widjaja
Excellent! Glad that sorted it
Brian Agnew
A: 

Rescheduling the project builds solved the problem. The Hudson process was being killed by the Linux kernel because of excessive memory consumption.
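
For anyone hitting the same thing, a minimal sketch of how to check whether the kernel's OOM killer was involved. It assumes the kernel messages go to the usual syslog file /var/log/messages (the path may differ on your distribution), and find_oom_events is only an illustrative name.

```shell
# Print OOM-killer related lines from a syslog-style file.
# The default path is an assumption; pass another file as the first argument.
# find_oom_events is a hypothetical helper name.
find_oom_events() {
    log_file="${1:-/var/log/messages}"
    grep -i -E 'oom_kill_process|out_of_memory|oom-killer|out of memory' "$log_file"
}

# Usage (typically needs root to read the log):
#   find_oom_events
#   find_oom_events /var/log/syslog
```

If this prints lines like the oom_kill_process trace quoted in the comments, the process was killed for memory reasons rather than crashing on its own.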

Albert Widjaja