climateprediction.net home page
Task 12311740

Task 12311740

Name hadam3p_pnw_zx4g_1975_1_007026248_0
Workunit 7229564
Created 24 Nov 2010, 17:14:26 UTC
Sent 12 Jan 2011, 12:14:15 UTC
Report deadline 25 Dec 2011, 17:34:15 UTC
Received 28 Jan 2011, 16:41:46 UTC
Server state Over
Outcome No reply
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1120237
Run time 2 days 15 hours 24 min 43 sec
CPU time 2 days 5 hours 29 min 33 sec
Validate state Invalid
Credit 502.72
Device peak FLOPS 0.77 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.08
windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4332, selfPID=4332,CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1404, selfPID=5840, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2936, selfPID=2936, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4904, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5240, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5408, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5464, selfPID=908, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 0
23:20:49 (1472): No heartbeat from core client for 30 sec - exiting
23:20:50 (1472): No heartbeat from core client for 30 sec - exiting
23:20:51 (1472): No heartbeat from core client for 30 sec - exiting
23:20:52 (1472): No heartbeat from core client for 30 sec - exiting
23:21:03 (4832): Can't acquire lockfile (32) - waiting 35s
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3856, selfPID=3788, iMonCtr=1
Model crash detected, will try to restart...
01:42:39 (4324): Can't acquire lockfile (32) - waiting 35s
02:09:14 (772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4696, selfPID=3664, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 0
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
15:25:37 (2788): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
21:10:58 (2972): No heartbeat from core client for 30 sec - exiting
21:10:59 (2972): No heartbeat from core client for 30 sec - exiting
21:11:00 (2972): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1404, selfPID=4864, iMonCtr=1
Model crash detected, will try to restart...
23:54:13 (3756): No heartbeat from core client for 30 sec - exiting
23:54:19 (3756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4784, selfPID=2120, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4640, selfPID=3668, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2496, selfPID=2496, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4956, selfPID=4956, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2540, selfPID=4284, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2828, selfPID=3136, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5816, selfPID=1700, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5480, iMonCtr=2
20:24:45 (2156): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5612, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5508, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5568, selfPID=4400, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5244, selfPID=5564, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4944, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4968, selfPID=4216, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4584, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5684, selfPID=4632, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5104, selfPID=4700, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3304, selfPID=5084, iMonCtr=1
Model crash detected, will try to restart...
06:44:06 (4200): Can't acquire lockfile (32) - waiting 35s
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3876, selfPID=3876, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4556, selfPID=4556, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6036, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2568, selfPID=544, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4936, selfPID=4936, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5856, selfPID=5856, iMonCtr=2
22:08:59 (6016): Can't acquire lockfile (32) - waiting 35s
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3012, selfPID=3012, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
18:27:34 (5308): Can't acquire lockfile (32) - waiting 35s
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
C21:54:36 (228): Can't acquire lockfile (32) - waiting 35s
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5980, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6028, selfPID=4896, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5772, selfPID=5772, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
forrtl: Access is denied.

Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5700, selfPID=5700, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5700, selfPID=2344, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 2
forrtl: Access is denied.

Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3304, selfPID=512, iMonCtr=1
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3304, selfPID=3304, iMonCtr=2
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 2
No Process Handle
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3424, iMonCtr=2
01:41:24 (512): called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_3.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_4.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_5.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_zx4g_1975_1_007026248_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
21 Jan 2011 14:54:46 1120237 12311740 hadam3p_pnw_zx4g_1975_1_007026248_0 23,136 147,981 6.3961
18 Jan 2011 06:12:12 1120237 12311740 hadam3p_pnw_zx4g_1975_1_007026248_0 11,616 78,338 6.7440


©2024 climateprediction.net