Task 16786535

Name hadam3p_eu_k54i_2013_1_008866627_0
Workunit 9012556
Created 9 Jul 2014, 14:20:43 UTC
Sent 14 Jul 2014, 22:36:30 UTC
Report deadline 27 Jun 2015, 3:56:30 UTC
Received 21 Sep 2014, 19:27:29 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1310670
Run time 5 days 0 hours 59 min 27 sec
CPU time 4 days 2 hours 33 min 25 sec
Validate state Invalid
Credit 1,790.31
Device peak FLOPS 2.16 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09
windows_intelx86
Stderr
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9096, selfPID=9052, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=12244, selfPID=9284, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10552, selfPID=5668, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3144, selfPID=7756, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4768, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5228, selfPID=8004, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:52:00 (6028): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:52:03 (6028): No heartbeat from core client for 30 sec - exiting
16:52:04 (6028): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4940, selfPID=520, iMonCtr=1
Model crash detected, will try to restart...
GSuspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6408, selfPID=5668, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=12440, selfPID=1688, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6428, selfPID=7596, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6628, selfPID=9860, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
23:44:55 (5376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:50:11 (9888): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4284, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
17:22:39 (11368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:23:11 (10008): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8956, selfPID=8956, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6804, selfPID=1844, iMonCtr=1
Model crash detected, will try to restart...
21:17:34 (11484): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5648, selfPID=5648, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9776, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8456, iMonCtr=2
Model crash detected, will try to restart...
19:58:46 (11944): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:53:53 (2792): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4644, selfPID=10968, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6932, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7640, iMonCtr=2
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11172, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11400, iMonCtr=2
Model crash detected, will try to restart...
06:45:08 (7068): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:45:09 (7068): No heartbeat from core client for 30 sec - exiting
06:57:13 (9332): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:03:16 (7500): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8064, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7976, selfPID=8752, iMonCtr=1
Model crash detected, will try to restart...
17:01:29 (5872): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5260, selfPID=5260, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2616, iMonCtr=2
Model crash detected, will try to restart...
05:26:10 (6320): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6624, selfPID=5832, iMonCtr=1
Model crash detected, will try to restart...
06:43:46 (5428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4548, selfPID=5308, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5992, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8068, selfPID=348, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5656, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5752, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1980, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5268, selfPID=792, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4372, selfPID=6252, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_k54i_2013_1_008866627_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_k54i_2013_1_008866627_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_k54i_2013_1_008866627_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
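The stderr output above repeats a small number of message types many times. As a rough aid for reading it, the following is a minimal sketch that tallies those messages by category. The category names and the `tally` helper are hypothetical, introduced only for illustration. The patterns are substrings copied from the log lines above, not an official CPDN or BOINC message list.

```python
import re
from collections import Counter

# Hypothetical category names; each pattern is a substring copied from the stderr text.
PATTERNS = {
    "suspend": re.compile(r"Suspend request from BOINC"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "restart": re.compile(r"Model crash detected, will try to restart"),
    "not_running": re.compile(r"CPDN process is not running"),
}


def tally(stderr_text):
    """Count how many lines of stderr_text match each pattern."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines copied from the log above, used as sample input.
sample = """\
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9096, selfPID=9052, iMonCtr=1
Model crash detected, will try to restart...
16:52:00 (6028): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
"""

print(dict(tally(sample)))
```

Passing the full contents of the stderr_txt element to `tally` instead of `sample` would give per-category totals for the whole task. The sketch only counts lines; it does not interpret why the model process stopped.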
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
17 Sep 2014 10:03:21  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  103,782   329,758         3.1774
16 Sep 2014 11:15:27  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  103,776   329,225         3.1725
10 Sep 2014 21:26:07  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  92,256    293,162         3.1777
30 Aug 2014 20:52:47  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  80,743    256,758         3.1799
30 Aug 2014 14:13:25  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  80,736    256,227         3.1736
25 Aug 2014 01:44:24  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  69,216    219,791         3.1754
16 Aug 2014 13:14:01  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  57,696    182,349         3.1605
15 Aug 2014 10:47:47  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  46,176    145,087         3.1420
15 Aug 2014 10:47:47  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  34,656    107,945         3.1148
03 Aug 2014 16:40:11  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  23,136    72,546          3.1356
29 Jul 2014 16:39:59  1310670  16786535   hadam3p_eu_k54i_2013_1_008866627_0  11,616    36,237          3.1196
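
The Average (sec/TS) column in the trickle table appears to be the CPU time divided by the timestep count. The page does not state this; it is an inference from the numbers. The sketch below checks that assumption against three rows copied from the table. The helper name `seconds_per_timestep` is hypothetical.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
rows = [
    (103782, 329758, 3.1774),
    (103776, 329225, 3.1725),
    (11616, 36237, 3.1196),
]


def seconds_per_timestep(timestep, cpu_time_sec):
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in rows:
    computed = seconds_per_timestep(timestep, cpu_time_sec)
    # The table shows four decimal places, so allow for rounding at that precision.
    matches = abs(computed - reported) < 5e-5
    print(f"{timestep:>7} steps: computed {computed:.4f}, reported {reported:.4f}, match={matches}")
```

For these three rows the computed and reported values agree to four decimal places, which is consistent with the assumed definition but does not prove it for every row.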

