Task 15707821

Name hadam3p_eu_qfvv_2001_1_008346376_0
Workunit 8497237
Created 5 Apr 2013, 14:16:51 UTC
Sent 5 Apr 2013, 19:02:31 UTC
Report deadline 19 Mar 2014, 0:22:31 UTC
Received 20 Aug 2013, 13:38:05 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1253472
Run time 4 days 3 hours 46 min
CPU time 2 days 15 hours 20 min 21 sec
Validate state Invalid
Credit 995.30
Device peak FLOPS 1.85 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09
windows_intelx86
Stderr
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<stderr_txt>
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5052, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6904, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5188, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4132, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2444, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1772, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3100, selfPID=4808, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6356, selfPID=1028, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4264, selfPID=2620, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4020, selfPID=2928, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5776, selfPID=5776, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
09:03:42 (2392): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3316, selfPID=5200, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5888, selfPID=1932, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2828, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5540, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6700, selfPID=6712, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5100, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3716, selfPID=5344, iMonCtr=1
Model crash detected, will try to restart...
13:58:18 (1932): No heartbeat from core client for 30 sec - exiting
13:58:19 (1932): No heartbeat from core client for 30 sec - exiting
13:58:20 (1932): No heartbeat from core client for 30 sec - exiting
13:58:21 (1932): No heartbeat from core client for 30 sec - exiting
13:58:22 (1932): No heartbeat from core client for 30 sec - exiting
13:58:23 (1932): No heartbeat from core client for 30 sec - exiting
13:58:24 (1932): No heartbeat from core client for 30 sec - exiting
13:58:25 (1932): No heartbeat from core client for 30 sec - exiting
13:58:26 (1932): No heartbeat from core client for 30 sec - exiting
13:58:27 (1932): No heartbeat from core client for 30 sec - exiting
13:58:28 (1932): No heartbeat from core client for 30 sec - exiting
13:58:29 (1932): No heartbeat from core client for 30 sec - exiting
13:58:30 (1932): No heartbeat from core client for 30 sec - exiting
13:58:32 (1932): No heartbeat from core client for 30 sec - exiting
13:58:33 (1932): No heartbeat from core client for 30 sec - exiting
13:58:34 (1932): No heartbeat from core client for 30 sec - exiting
13:58:35 (1932): No heartbeat from core client for 30 sec - exiting
13:58:36 (1932): No heartbeat from core client for 30 sec - exiting
13:58:37 (1932): No heartbeat from core client for 30 sec - exiting
13:58:38 (1932): No heartbeat from core client for 30 sec - exiting
13:58:39 (1932): No heartbeat from core client for 30 sec - exiting
13:58:40 (1932): No heartbeat from core client for 30 sec - exiting
13:58:41 (1932): No heartbeat from core client for 30 sec - exiting
13:58:42 (1932): No heartbeat from core client for 30 sec - exiting
13:58:43 (1932): No heartbeat from core client for 30 sec - exiting
13:58:44 (1932): No heartbeat from core client for 30 sec - exiting
13:58:45 (1932): No heartbeat from core client for 30 sec - exiting
13:58:46 (1932): No heartbeat from core client for 30 sec - exiting
13:58:47 (1932): No heartbeat from core client for 30 sec - exiting
13:58:48 (1932): No heartbeat from core client for 30 sec - exiting
13:58:49 (1932): No heartbeat from core client for 30 sec - exiting
13:58:50 (1932): No heartbeat from core client for 30 sec - exiting
13:58:51 (1932): No heartbeat from core client for 30 sec - exiting
13:58:52 (1932): No heartbeat from core client for 30 sec - exiting
13:58:53 (1932): No heartbeat from core client for 30 sec - exiting
13:58:54 (1932): No heartbeat from core client for 30 sec - exiting
13:58:55 (1932): No heartbeat from core client for 30 sec - exiting
13:58:56 (1932): No heartbeat from core client for 30 sec - exiting
13:58:57 (1932): No heartbeat from core client for 30 sec - exiting
13:58:58 (1932): No heartbeat from core client for 30 sec - exiting
13:58:59 (1932): No heartbeat from core client for 30 sec - exiting
13:59:00 (1932): No heartbeat from core client for 30 sec - exiting
13:59:01 (1932): No heartbeat from core client for 30 sec - exiting
13:59:02 (1932): No heartbeat from core client for 30 sec - exiting
13:59:03 (1932): No heartbeat from core client for 30 sec - exiting
13:59:04 (1932): No heartbeat from core client for 30 sec - exiting
13:59:05 (1932): No heartbeat from core client for 30 sec - exiting
13:59:06 (1932): No heartbeat from core client for 30 sec - exiting
13:59:07 (1932): No heartbeat from core client for 30 sec - exiting
13:59:08 (1932): No heartbeat from core client for 30 sec - exiting
13:59:09 (1932): No heartbeat from core client for 30 sec - exiting
13:59:10 (1932): No heartbeat from core client for 30 sec - exiting
13:59:11 (1932): No heartbeat from core client for 30 sec - exiting
13:59:12 (1932): No heartbeat from core client for 30 sec - exiting
13:59:13 (1932): No heartbeat from core client for 30 sec - exiting
13:59:14 (1932): No heartbeat from core client for 30 sec - exiting
13:59:15 (1932): No heartbeat from core client for 30 sec - exiting
13:59:16 (1932): No heartbeat from core client for 30 sec - exiting
13:59:17 (1932): No heartbeat from core client for 30 sec - exiting
13:59:18 (1932): No heartbeat from core client for 30 sec - exiting
13:59:19 (1932): No heartbeat from core client for 30 sec - exiting
13:59:20 (1932): No heartbeat from core client for 30 sec - exiting
13:59:21 (1932): No heartbeat from core client for 30 sec - exiting
13:59:22 (1932): No heartbeat from core client for 30 sec - exiting
13:59:23 (1932): No heartbeat from core client for 30 sec - exiting
13:59:24 (1932): No heartbeat from core client for 30 sec - exiting
13:59:25 (1932): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:03:28 (2404): No heartbeat from core client for 30 sec - exiting
16:03:29 (2404): No heartbeat from core client for 30 sec - exiting
16:03:30 (2404): No heartbeat from core client for 30 sec - exiting
16:03:32 (2404): No heartbeat from core client for 30 sec - exiting
16:03:33 (2404): No heartbeat from core client for 30 sec - exiting
16:03:34 (2404): No heartbeat from core client for 30 sec - exiting
16:03:35 (2404): No heartbeat from core client for 30 sec - exiting
16:03:36 (2404): No heartbeat from core client for 30 sec - exiting
16:03:37 (2404): No heartbeat from core client for 30 sec - exiting
16:03:38 (2404): No heartbeat from core client for 30 sec - exiting
16:03:40 (2404): No heartbeat from core client for 30 sec - exiting
16:03:41 (2404): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6308, selfPID=728, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
10:37:54 (4280): No heartbeat from core client for 30 sec - exiting
10:37:55 (4280): No heartbeat from core client for 30 sec - exiting
10:37:57 (4280): No heartbeat from core client for 30 sec - exiting
10:37:58 (4280): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1020, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4880, selfPID=1752, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt><message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_qfvv_2001_1_008346376_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
17 Aug 2013 11:25:06  1253472  15707821   hadam3p_eu_qfvv_2001_1_008346376_0  57,696    207,036         3.5884
26 Jul 2013 15:56:56  1253472  15707821   hadam3p_eu_qfvv_2001_1_008346376_0  46,176    165,580         3.5858
23 Jul 2013 19:47:20  1253472  15707821   hadam3p_eu_qfvv_2001_1_008346376_0  34,656    124,292         3.5864
02 Jul 2013 11:48:52  1253472  15707821   hadam3p_eu_qfvv_2001_1_008346376_0  23,136    83,825          3.6231
10 Apr 2013 15:15:34  1253472  15707821   hadam3p_eu_qfvv_2001_1_008346376_0  11,616    42,514          3.6600
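
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The page does not state this; it is an assumption inferred from the figures. The short Python sketch below recomputes the column from the Timestep and CPU Time values listed above so the two can be compared. The names `TRICKLES` and `average_sec_per_timestep` are hypothetical and are not part of BOINC or CPDN.

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, average reported by the page)
TRICKLES = [
    (57696, 207036, 3.5884),
    (46176, 165580, 3.5858),
    (34656, 124292, 3.5864),
    (23136, 83825, 3.6231),
    (11616, 42514, 3.6600),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    # Print the recomputed value beside the value shown on the page.
    print(f"{timestep:>6} {cpu_time_sec:>7} "
          f"computed={computed:.4f} reported={reported:.4f}")
```

Under this assumption the recomputed values agree with the reported ones to four decimal places for all five trickles.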


©2024 climateprediction.net