Task 16341414

Name hadam3p_eu_i2ap_2013_1_008551681_0
Workunit 8699193
Created 5 Mar 2014, 16:41:46 UTC
Sent 7 Mar 2014, 19:33:29 UTC
Report deadline 18 Feb 2015, 0:53:29 UTC
Received 11 Mar 2014, 8:31:47 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1229790
Run time 2 days 3 hours 11 min 18 sec
CPU time 1 day 7 hours 51 min 56 sec
Validate state Invalid
Credit 597.84
Device peak FLOPS 2.31 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4076, iMonCtr=2
Model crash detected, will try to restart...
20:36:41 (6396): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4788, selfPID=11180, iMonCtr=1
Model crash detected, will try to restart...
01:18:04 (11960): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
01:18:05 (11960): No heartbeat from core client for 30 sec - exiting
01:51:14 (7916): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8568, selfPID=10128, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
20:12:27 (6208): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
C20:46:44 (6900): No heartbeat from core client for 30 sec - exiting
20:46:45 (6900): No heartbeat from core client for 30 sec - exiting
20:46:46 (6900): No heartbeat from core client for 30 sec - exiting
20:46:47 (6900): No heartbeat from core client for 30 sec - exiting
20:46:48 (6900): No heartbeat from core client for 30 sec - exiting
20:46:49 (6900): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:27:01 (8364): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=13160, selfPID=15984, iMonCtr=1
Model crash detected, will try to restart...
23:19:19 (1044): No heartbeat from core client for 30 sec - exiting
23:19:20 (1044): No heartbeat from core client for 30 sec - exiting
23:19:21 (1044): No heartbeat from core client for 30 sec - exiting
23:19:22 (1044): No heartbeat from core client for 30 sec - exiting
23:19:23 (1044): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8620, selfPID=4840, iMonCtr=1
Model crash detected, will try to restart...
23:39:53 (6892): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:39:54 (6892): No heartbeat from core client for 30 sec - exiting
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9220, selfPID=9220, iMonCtr=2
05:34:55 (8328): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=14236, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2684, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1660, selfPID=2536, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5484, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4832, selfPID=7028, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
23:42:49 (7272): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10696, selfPID=4584, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
04:42:33 (8860): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:19:14 (9796): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_i2ap_2013_1_008551681_0_4.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_i2ap_2013_1_008551681_0_5.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_i2ap_2013_1_008551681_0_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_i2ap_2013_1_008551681_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_i2ap_2013_1_008551681_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_i2ap_2013_1_008551681_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_i2ap_2013_1_008551681_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_i2ap_2013_1_008551681_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_i2ap_2013_1_008551681_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
09 Mar 2014 17:07:34  1229790  16341414   hadam3p_eu_i2ap_2013_1_008551681_0  34,656    86,801          2.5046
09 Mar 2014 06:35:13  1229790  16341414   hadam3p_eu_i2ap_2013_1_008551681_0  23,136    58,501          2.5286
08 Mar 2014 12:20:57  1229790  16341414   hadam3p_eu_i2ap_2013_1_008551681_0  11,616    29,144          2.5090
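
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The short Python sketch below checks that reading against the three trickle rows. The variable and function names are illustrative only and are not part of any climateprediction.net or BOINC tooling.

```python
# (timestep, cpu_time_sec, reported_average) copied from the three trickle rows.
trickles = [
    (34656, 86801, 2.5046),
    (23136, 58501, 2.5286),
    (11616, 29144, 2.5090),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds divided by timesteps completed."""
    return cpu_time_sec / timestep


# Recompute each average, rounded to the four decimals shown on the page.
computed = [round(average_sec_per_timestep(cpu, ts), 4) for ts, cpu, _ in trickles]

for (ts, cpu, reported), value in zip(trickles, computed):
    # A match within rounding supports the assumed definition.
    matches = abs(value - reported) < 1e-4
    print(f"timestep={ts} cpu={cpu} computed={value:.4f} reported={reported:.4f} match={matches}")
```

All three recomputed values agree with the reported column to four decimal places, which is consistent with the assumed definition but does not confirm how the server computes it.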

