Name | hadam3p_eu_k1dh_2013_1_008535311_0 |
Workunit | 8682823 |
Created | 3 Mar 2014, 15:39:19 UTC |
Sent | 3 Mar 2014, 15:59:40 UTC |
Report deadline | 13 Feb 2015, 21:19:40 UTC |
Received | 5 Mar 2014, 13:59:23 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1229790 |
Run time | 23 hours 55 min 30 sec |
CPU time | 14 hours 42 min 14 sec |
Validate state | Invalid |
Credit | 200.52 |
Device peak FLOPS | 2.31 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>7.2.33</core_client_version> <![CDATA[ <stderr_txt> 12:52:20 (9272): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:16:29 (13120): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:38:32 (12764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=11316, selfPID=11316, iMonCtr=2 01:13:43 (2388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7768, selfPID=12204, iMonCtr=1 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8572, iMonCtr=2 Model crash detected, will try to restart... 10:13:26 (7108): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
10:13:27 (7108): No heartbeat from core client for 30 sec - exiting 10:13:28 (7108): No heartbeat from core client for 30 sec - exiting 10:13:29 (7108): No heartbeat from core client for 30 sec - exiting 10:13:30 (7108): No heartbeat from core client for 30 sec - exiting 10:13:31 (7108): No heartbeat from core client for 30 sec - exiting 10:13:32 (7108): No heartbeat from core client for 30 sec - exiting 10:13:33 (7108): No heartbeat from core client for 30 sec - exiting Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8816, selfPID=8816, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3100, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7176, selfPID=11592, iMonCtr=1 Model crash detected, will try to restart... 10:58:07 (8568): No heartbeat from core client for 30 sec - exiting 10:58:08 (8568): No heartbeat from core client for 30 sec - exiting 10:58:09 (8568): No heartbeat from core client for 30 sec - exiting 10:58:10 (8568): No heartbeat from core client for 30 sec - exiting 10:58:11 (8568): No heartbeat from core client for 30 sec - exiting 10:58:12 (8568): No heartbeat from core client for 30 sec - exiting 10:58:13 (8568): No heartbeat from core client for 30 sec - exiting 10:58:14 (8568): No heartbeat from core client for 30 sec - exiting 10:58:15 (8568): No heartbeat from core client for 30 sec - exiting 10:58:16 (8568): No heartbeat from core client for 30 sec - exiting 10:58:17 (8568): No heartbeat from core client for 30 sec - exiting 10:58:18 (8568): No heartbeat from core client for 30 sec - exiting 10:58:19 (8568): No heartbeat from core client for 30 sec - exiting 10:58:20 (8568): No heartbeat from core client for 30 sec - exiting 10:58:22 (8568): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6696, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3264, selfPID=6076, iMonCtr=1 Model crash detected, will try to restart... 13:27:15 (11236): No heartbeat from core client for 30 sec - exiting 13:27:17 (11236): No heartbeat from core client for 30 sec - exiting 13:27:18 (11236): No heartbeat from core client for 30 sec - exiting 13:27:19 (11236): No heartbeat from core client for 30 sec - exiting 13:27:20 (11236): No heartbeat from core client for 30 sec - exiting 13:27:21 (11236): No heartbeat from core client for 30 sec - exiting 13:27:22 (11236): No heartbeat from core client for 30 sec - exiting 13:27:23 (11236): No heartbeat from core client for 30 sec - exiting 13:27:24 (11236): No heartbeat from core client for 30 sec - exiting 13:27:25 (11236): No heartbeat from core client for 30 sec - exiting 13:27:26 (11236): No heartbeat from core client for 30 sec - exiting 13:27:27 (11236): No heartbeat from core client for 30 sec - exiting 13:27:28 (11236): No heartbeat from core client for 30 sec - exiting 13:27:29 (11236): No heartbeat from core client for 30 sec - exiting 13:27:30 (11236): No heartbeat from core client for 30 sec - exiting 13:27:31 (11236): No heartbeat from core client for 30 sec - exiting 13:27:32 (11236): No heartbeat from core client for 30 sec - exiting 13:27:33 (11236): No heartbeat from core client for 30 sec - exiting 13:27:34 (11236): No heartbeat from core client for 30 sec - exiting 13:27:35 (11236): No heartbeat from core client for 30 sec - exiting 13:27:36 (11236): No heartbeat from core client for 30 sec - exiting 13:27:37 (11236): No heartbeat from core client for 30 sec - exiting 13:27:38 (11236): No heartbeat from core client for 30 sec - exiting 13:27:40 (11236): No heartbeat from core client for 30 sec - exiting 13:27:41 (11236): No heartbeat from core client for 30 sec - exiting 13:27:42 (11236): No 
heartbeat from core client for 30 sec - exiting 13:27:43 (11236): No heartbeat from core client for 30 sec - exiting 13:27:44 (11236): No heartbeat from core client for 30 sec - exiting 13:27:45 (11236): No heartbeat from core client for 30 sec - exiting 13:27:46 (11236): No heartbeat from core client for 30 sec - exiting 13:27:47 (11236): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:29:05 (8372): Can't acquire lockfile (32) - waiting 35s Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5796, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3540, selfPID=8372, iMonCtr=1 Model crash detected, will try to restart... 13:52:25 (10924): No heartbeat from core client for 30 sec - exiting 13:52:27 (10924): No heartbeat from core client for 30 sec - exiting 13:52:28 (10924): No heartbeat from core client for 30 sec - exiting 13:52:29 (10924): No heartbeat from core client for 30 sec - exiting 13:52:30 (10924): No heartbeat from core client for 30 sec - exiting 13:52:31 (10924): No heartbeat from core client for 30 sec - exiting 13:52:32 (10924): No heartbeat from core client for 30 sec - exiting 13:52:33 (10924): No heartbeat from core client for 30 sec - exiting 13:52:34 (10924): No heartbeat from core client for 30 sec - exiting 13:52:35 (10924): No heartbeat from core client for 30 sec - exiting 13:52:36 (10924): No heartbeat from core client for 30 sec - exiting 13:52:37 (10924): No heartbeat from core client for 30 sec - exiting 13:52:38 (10924): No heartbeat from core client for 30 sec - exiting 13:52:39 (10924): No heartbeat from core client for 30 sec - exiting 13:52:40 (10924): No heartbeat from core client for 30 sec - exiting 13:52:41 (10924): No heartbeat from core client for 30 sec - exiting 13:52:42 (10924): No heartbeat from core client for 30 sec - exiting 13:52:43 (10924): No heartbeat from core client for 30 sec - exiting 13:52:44 
(10924): No heartbeat from core client for 30 sec - exiting 13:52:45 (10924): No heartbeat from core client for 30 sec - exiting 13:52:47 (10924): No heartbeat from core client for 30 sec - exiting 13:52:48 (10924): No heartbeat from core client for 30 sec - exiting 13:52:49 (10924): No heartbeat from core client for 30 sec - exiting 13:52:50 (10924): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10096, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7328, selfPID=12608, iMonCtr=1 Model crash detected, will try to restart... 20:41:00 (5924): No heartbeat from core client for 30 sec - exiting 20:41:01 (5924): No heartbeat from core client for 30 sec - exiting 20:41:02 (5924): No heartbeat from core client for 30 sec - exiting 20:41:03 (5924): No heartbeat from core client for 30 sec - exiting 20:41:04 (5924): No heartbeat from core client for 30 sec - exiting 20:41:05 (5924): No heartbeat from core client for 30 sec - exiting 20:41:06 (5924): No heartbeat from core client for 30 sec - exiting 20:41:07 (5924): No heartbeat from core client for 30 sec - exiting 20:41:08 (5924): No heartbeat from core client for 30 sec - exiting 20:41:09 (5924): No heartbeat from core client for 30 sec - exiting 20:41:10 (5924): No heartbeat from core client for 30 sec - exiting 20:41:11 (5924): No heartbeat from core client for 30 sec - exiting 20:41:12 (5924): No heartbeat from core client for 30 sec - exiting 20:41:13 (5924): No heartbeat from core client for 30 sec - exiting 20:41:14 (5924): No heartbeat from core client for 30 sec - exiting 20:41:15 (5924): No heartbeat from core client for 30 sec - exiting 20:41:16 (5924): No heartbeat from core client for 30 sec - exiting 20:41:17 (5924): No heartbeat from core client for 30 sec - exiting 20:41:19 (5924): 
No heartbeat from core client for 30 sec - exiting 20:41:20 (5924): No heartbeat from core client for 30 sec - exiting 20:41:21 (5924): No heartbeat from core client for 30 sec - exiting 20:41:22 (5924): No heartbeat from core client for 30 sec - exiting 20:41:23 (5924): No heartbeat from core client for 30 sec - exiting 20:41:24 (5924): No heartbeat from core client for 30 sec - exiting 20:41:25 (5924): No heartbeat from core client for 30 sec - exiting 20:41:26 (5924): No heartbeat from core client for 30 sec - exiting 20:41:27 (5924): No heartbeat from core client for 30 sec - exiting 20:41:28 (5924): No heartbeat from core client for 30 sec - exiting 20:41:29 (5924): No heartbeat from core client for 30 sec - exiting 20:41:30 (5924): No heartbeat from core client for 30 sec - exiting 20:41:31 (5924): No heartbeat from core client for 30 sec - exiting 20:41:32 (5924): No heartbeat from core client for 30 sec - exiting 20:41:33 (5924): No heartbeat from core client for 30 sec - exiting 20:41:34 (5924): No heartbeat from core client for 30 sec - exiting 20:41:35 (5924): No heartbeat from core client for 30 sec - exiting 20:41:36 (5924): No heartbeat from core client for 30 sec - exiting 20:41:37 (5924): No heartbeat from core client for 30 sec - exiting 20:41:38 (5924): No heartbeat from core client for 30 sec - exiting 20:41:39 (5924): No heartbeat from core client for 30 sec - exiting 20:41:41 (5924): No heartbeat from core client for 30 sec - exiting 20:41:42 (5924): No heartbeat from core client for 30 sec - exiting 20:41:43 (5924): No heartbeat from core client for 30 sec - exiting 20:41:44 (5924): No heartbeat from core client for 30 sec - exiting 20:41:45 (5924): No heartbeat from core client for 30 sec - exiting 20:41:46 (5924): No heartbeat from core client for 30 sec - exiting 20:41:47 (5924): No heartbeat from core client for 30 sec - exiting 20:41:48 (5924): No heartbeat from core client for 30 sec - exiting 20:41:49 (5924): No heartbeat from core 
client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:29:05 (2960): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:29:06 (2960): No heartbeat from core client for 30 sec - exiting 00:29:16 (10744): Can't acquire lockfile (32) - waiting 35s 00:50:15 (10744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:31:35 (9452): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:31:36 (9452): No heartbeat from core client for 30 sec - exiting 02:31:37 (9452): No heartbeat from core client for 30 sec - exiting 02:31:38 (9452): No heartbeat from core client for 30 sec - exiting 02:31:39 (9452): No heartbeat from core client for 30 sec - exiting 03:17:10 (11600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:35:20 (11920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10472, selfPID=9096, iMonCtr=1 Model crash detected, will try to restart... 05:03:49 (576): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048 Leaving CPDN_Main::Monitor... 
Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k1dh_2013_1_008535311_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
04 Mar 2014 16:10:55 | 1229790 | 16317412 | hadam3p_eu_k1dh_2013_1_008535311_0 | 11,624 | 28,810 | 2.4785 |
04 Mar 2014 15:55:16 | 1229790 | 16317412 | hadam3p_eu_k1dh_2013_1_008535311_0 | 11,616 | 28,458 | 2.4499 |
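
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. This is an inference from the two trickle rows above, not something the page states. A minimal Python sketch of that check, where `average_sec_per_timestep` is a hypothetical helper name introduced only for illustration:

```python
def average_sec_per_timestep(cpu_time_sec: float, timestep: int) -> float:
    """Return CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the trickle table's Average column is CPU time / timestep.
    """
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# Values copied from the two trickle rows above: (time sent, timestep, CPU seconds).
trickles = [
    ("04 Mar 2014 16:10:55", 11_624, 28_810),
    ("04 Mar 2014 15:55:16", 11_616, 28_458),
]

for sent, timestep, cpu_time in trickles:
    print(sent, average_sec_per_timestep(cpu_time, timestep))
```

Both rows reproduce the listed averages (2.4785 and 2.4499), which supports the assumption for this result.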
©2024 climateprediction.net