Task 14934311

Name hadam3p_eu_a154_1976_1_008059860_0
Workunit 8214974
Created 18 Jul 2012, 6:25:44 UTC
Sent 18 Jul 2012, 6:31:06 UTC
Report deadline 30 Jun 2013, 11:51:06 UTC
Received 17 Aug 2012, 5:23:23 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1223423
Run time 2 days 7 hours 49 min 18 sec
CPU time 2 days 6 hours 19 min 1 sec
Validate state Invalid
Credit 1,194.08
Device peak FLOPS 2.87 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>7.0.31</core_client_version>
<![CDATA[
<stderr_txt>
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3104, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2748, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2608, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4820, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4828, selfPID=3464, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3344, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:46:19 (3992): No heartbeat from core client for 30 sec - exiting
22:46:20 (3992): No heartbeat from core client for 30 sec - exiting
22:46:21 (3992): No heartbeat from core client for 30 sec - exiting
22:46:23 (3992): No heartbeat from core client for 30 sec - exiting
22:46:24 (3992): No heartbeat from core client for 30 sec - exiting
22:46:25 (3992): No heartbeat from core client for 30 sec - exiting
22:46:26 (3992): No heartbeat from core client for 30 sec - exiting
22:46:27 (3992): No heartbeat from core client for 30 sec - exiting
22:46:28 (3992): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4652, selfPID=4308, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4688, selfPID=2740, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2188, selfPID=3912, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3548, selfPID=3804, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5056, selfPID=3536, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3256, selfPID=3904, iMonCtr=1
Model crash detected, will try to restart...
GSuspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4780, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4256, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4292, selfPID=3652, iMonCtr=1
Model crash detected, will try to restart...
05:50:16 (3680): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4728, selfPID=4728, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4496, selfPID=4496, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5084, selfPID=3736, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4736, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3804, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5092, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
01:25:34 (4040): No heartbeat from core client for 30 sec - exiting
01:25:35 (4040): No heartbeat from core client for 30 sec - exiting
01:25:36 (4040): No heartbeat from core client for 30 sec - exiting
01:25:37 (4040): No heartbeat from core client for 30 sec - exiting
01:25:38 (4040): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3992, iMonCtr=2
Model crash detected, will try to restart...

zip error: Could not create output file (was replacing the original zip file)

Model crashed: 
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_a154_1976_1_008059860_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_a154_1976_1_008059860_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_a154_1976_1_008059860_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_a154_1976_1_008059860_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_a154_1976_1_008059860_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_a154_1976_1_008059860_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
17 Aug 2012 05:24:14 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 69,219 | 194,453 | 2.8092
17 Aug 2012 04:24:03 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 69,216 | 194,088 | 2.8041
14 Aug 2012 02:45:18 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 57,722 | 163,665 | 2.8354
13 Aug 2012 09:29:31 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 57,712 | 163,249 | 2.8287
13 Aug 2012 09:29:31 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 57,703 | 162,857 | 2.8223
13 Aug 2012 08:04:09 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 57,696 | 162,478 | 2.8161
12 Aug 2012 03:44:22 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 46,176 | 134,430 | 2.9113
08 Aug 2012 06:51:16 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 34,656 | 105,926 | 3.0565
28 Jul 2012 05:48:37 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 23,136 | 73,456 | 3.1750
20 Jul 2012 08:17:00 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 11,620 | 38,630 | 3.3244
20 Jul 2012 05:56:33 | 1223423 | 14934311 | hadam3p_eu_a154_1976_1_008059860_0 | 11,616 | 38,181 | 3.2869
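
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU Time divided by the Timestep. The page does not say so; this is inferred from the figures. A minimal Python sketch that checks the inference against three rows copied from the table (the function name `seconds_per_timestep` is hypothetical, chosen for illustration):

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: this mirrors how the Average (sec/TS) column is derived;
    the page itself does not state the formula.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, average_reported_on_page), copied from the table
rows = [
    (69219, 194453, 2.8092),
    (57722, 163665, 2.8354),
    (11616, 38181, 3.2869),
]

for timestep, cpu_time_sec, reported in rows:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Print each row so the computed and reported values can be compared.
    print(timestep, cpu_time_sec, computed, reported)
```

If the inference holds, the computed and reported values agree to four decimal places for each of these rows.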


©2024 climateprediction.net