Task 12272688

Name hadam3p_saf_1o8l_1980_1_006988045_0
Workunit 7191361
Created 24 Nov 2010, 9:14:54 UTC
Sent 24 Feb 2011, 16:44:24 UTC
Report deadline 6 Feb 2012, 22:04:24 UTC
Received 16 Mar 2011, 16:28:28 UTC
Server state Over
Outcome Didn't need
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1108089
Run time 4 days 3 hours 34 min 10 sec
CPU time 1 hour 43 min 58 sec
Validate state Invalid
Credit 562.19
Device peak FLOPS 2.01 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Southern Africa v6.08
windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
10:28:41 (4532): No heartbeat from core client for 30 sec - exiting
10:28:42 (4532): No heartbeat from core client for 30 sec - exiting
10:28:43 (4532): No heartbeat from core client for 30 sec - exiting
10:28:44 (4532): No heartbeat from core client for 30 sec - exiting
10:28:45 (4532): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5168, selfPID=5168, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=828, selfPID=828, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3960, selfPID=5420, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4212, selfPID=4704, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5656, selfPID=5196, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2736, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5504, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4956, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5036, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5348, selfPID=5048, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1692, selfPID=4328, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5580, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3892, selfPID=4552, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5908, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5240, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4944, selfPID=5256, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5384, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5404, selfPID=1372, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5924, selfPID=5336, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
23:59:46 (5336): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5564, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2392, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2688, selfPID=3580, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1568, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
18:14:11 (1568): called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_4.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_5.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
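The stderr output above repeats a handful of message types, so a tally is easier to read than the raw log. The following is a minimal sketch, not part of the project's tooling: the `summarize` function, its regular expressions, and the dictionary keys are assumptions made for illustration, and `SAMPLE` is a short excerpt copied from the log above rather than the full text.

```python
import re
from collections import Counter

# Short excerpt copied from the stderr/message blocks above (not the full log).
SAMPLE = """\
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
10:28:41 (4532): No heartbeat from core client for 30 sec - exiting
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2736, iMonCtr=2
<file_xfer_error>
  <file_name>hadam3p_saf_1o8l_1980_1_006988045_0_4.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
"""

# Hypothetical patterns, written to match the line shapes seen in this log only.
PROCESS_EXIT = re.compile(
    r"^(Controller|Regional Worker|Global Worker):: CPDN process is not running"
)
XFER_ERROR = re.compile(
    r"<file_name>(.*?)</file_name>\s*<error_code>(-?\d+)</error_code>"
)


def summarize(text):
    """Count process exits per role, crash notices, heartbeat losses, and upload errors."""
    exits = Counter()
    crashes = 0
    heartbeats = 0
    for line in text.splitlines():
        match = PROCESS_EXIT.match(line)
        if match:
            exits[match.group(1)] += 1
        elif line.startswith("Model crash detected"):
            crashes += 1
        elif "No heartbeat from core client" in line:
            heartbeats += 1
    # Each file_xfer_error element pairs a file name with a numeric error code.
    xfer_errors = [(name, int(code)) for name, code in XFER_ERROR.findall(text)]
    return {
        "exits": dict(exits),
        "crashes": crashes,
        "heartbeats": heartbeats,
        "xfer_errors": xfer_errors,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Pasting the full stderr text in place of `SAMPLE` would give the totals for this task. The sketch only counts lines and does not interpret the error codes.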
Latest Trickles Received
Time Sent (UTC)      | Host ID | Result ID | Result Name                         | Timestep | CPU Time (sec) | Average (sec/TS)
12 Mar 2011 18:58:58 | 1108089 | 12272688  | hadam3p_saf_1o8l_1980_1_006988045_0 | 34,656   | 131,633        | 3.7983
08 Mar 2011 19:28:36 | 1108089 | 12272688  | hadam3p_saf_1o8l_1980_1_006988045_0 | 23,136   | 84,674         | 3.6598
08 Mar 2011 19:28:36 | 1108089 | 12272688  | hadam3p_saf_1o8l_1980_1_006988045_0 | 11,616   | 47,529         | 4.0917
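The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against the three rows above. The function name and the four-decimal rounding are assumptions chosen to match the displayed values, not something the page documents.

```python
# (time sent, timestep, CPU time in seconds, reported average) copied from the table above.
TRICKLES = [
    ("12 Mar 2011 18:58:58", 34656, 131633, 3.7983),
    ("08 Mar 2011 19:28:36", 23136, 84674, 3.6598),
    ("08 Mar 2011 19:28:36", 11616, 47529, 4.0917),
]


def average_sec_per_timestep(cpu_seconds, timestep):
    """Cumulative CPU seconds per model timestep, rounded to four decimals (assumed)."""
    return round(cpu_seconds / timestep, 4)


if __name__ == "__main__":
    for sent, timestep, cpu_seconds, reported in TRICKLES:
        computed = average_sec_per_timestep(cpu_seconds, timestep)
        print(f"{sent}  computed={computed:.4f}  reported={reported:.4f}")
```

For all three rows the computed value agrees with the reported one to four decimal places, which supports the interpretation of the column as CPU time divided by timestep.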

