Task 16441731

Name hadam3p_anz_n0og_2012_1_008576309_2
Workunit 8722821
Created 2 Apr 2014, 7:01:56 UTC
Sent 2 Apr 2014, 7:41:17 UTC
Report deadline 15 Mar 2015, 13:01:17 UTC
Received 6 May 2014, 13:54:05 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1311870
Run time 5 days 5 hours 1 min 21 sec
CPU time 6 hours 43 min 43 sec
Validate state Invalid
Credit 2,497.00
Device peak FLOPS 2.26 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 windows_intelx86
Stderr
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<stderr_txt>
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4076, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3480, selfPID=2748, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
11:11:14 (5948): No heartbeat from core client for 30 sec - exiting
11:11:15 (5948): No heartbeat from core client for 30 sec - exiting
11:11:16 (5948): No heartbeat from core client for 30 sec - exiting
11:11:17 (5948): No heartbeat from core client for 30 sec - exiting
11:11:18 (5948): No heartbeat from core client for 30 sec - exiting
11:11:19 (5948): No heartbeat from core client for 30 sec - exiting
11:11:20 (5948): No heartbeat from core client for 30 sec - exiting
11:11:21 (5948): No heartbeat from core client for 30 sec - exiting
11:11:22 (5948): No heartbeat from core client for 30 sec - exiting
11:11:23 (5948): No heartbeat from core client for 30 sec - exiting
11:11:24 (5948): No heartbeat from core client for 30 sec - exiting
11:11:25 (5948): No heartbeat from core client for 30 sec - exiting
11:11:26 (5948): No heartbeat from core client for 30 sec - exiting
11:11:27 (5948): No heartbeat from core client for 30 sec - exiting
11:11:28 (5948): No heartbeat from core client for 30 sec - exiting
11:11:29 (5948): No heartbeat from core client for 30 sec - exiting
11:11:30 (5948): No heartbeat from core client for 30 sec - exiting
11:11:31 (5948): No heartbeat from core client for 30 sec - exiting
11:11:32 (5948): No heartbeat from core client for 30 sec - exiting
11:11:33 (5948): No heartbeat from core client for 30 sec - exiting
11:11:34 (5948): No heartbeat from core client for 30 sec - exiting
11:11:35 (5948): No heartbeat from core client for 30 sec - exiting
11:11:36 (5948): No heartbeat from core client for 30 sec - exiting
11:11:37 (5948): No heartbeat from core client for 30 sec - exiting
11:11:38 (5948): No heartbeat from core client for 30 sec - exiting
11:11:39 (5948): No heartbeat from core client for 30 sec - exiting
11:11:40 (5948): No heartbeat from core client for 30 sec - exiting
11:11:41 (5948): No heartbeat from core client for 30 sec - exiting
11:11:42 (5948): No heartbeat from core client for 30 sec - exiting
11:11:43 (5948): No heartbeat from core client for 30 sec - exiting
11:11:44 (5948): No heartbeat from core client for 30 sec - exiting
11:11:45 (5948): No heartbeat from core client for 30 sec - exiting
11:11:46 (5948): No heartbeat from core client for 30 sec - exiting
11:11:47 (5948): No heartbeat from core client for 30 sec - exiting
11:11:48 (5948): No heartbeat from core client for 30 sec - exiting
11:11:49 (5948): No heartbeat from core client for 30 sec - exiting
11:11:50 (5948): No heartbeat from core client for 30 sec - exiting
11:11:51 (5948): No heartbeat from core client for 30 sec - exiting
11:11:52 (5948): No heartbeat from core client for 30 sec - exiting
11:11:53 (5948): No heartbeat from core client for 30 sec - exiting
11:11:54 (5948): No heartbeat from core client for 30 sec - exiting
11:11:55 (5948): No heartbeat from core client for 30 sec - exiting
11:11:56 (5948): No heartbeat from core client for 30 sec - exiting
11:11:57 (5948): No heartbeat from core client for 30 sec - exiting
11:11:58 (5948): No heartbeat from core client for 30 sec - exiting
11:11:59 (5948): No heartbeat from core client for 30 sec - exiting
11:12:00 (5948): No heartbeat from core client for 30 sec - exiting
11:12:01 (5948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5900, selfPID=4032, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5364, selfPID=5300, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5468, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4984, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
17:35:07 (4152): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6084, selfPID=6084, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2200, selfPID=5548, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=[garbled], iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4660, selfPID=5288, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
09:31:57 (3260): No heartbeat from core client for 30 sec - exiting
09:31:58 (3260): No heartbeat from core client for 30 sec - exiting
09:31:59 (3260): No heartbeat from core client for 30 sec - exiting
09:32:00 (3260): No heartbeat from core client for 30 sec - exiting
09:32:01 (3260): No heartbeat from core client for 30 sec - exiting
09:32:02 (3260): No heartbeat from core client for 30 sec - exiting
09:32:03 (3260): No heartbeat from core client for 30 sec - exiting
09:32:04 (3260): No heartbeat from core client for 30 sec - exiting
09:32:05 (3260): No heartbeat from core client for 30 sec - exiting
09:32:06 (3260): No heartbeat from core client for 30 sec - exiting
09:32:07 (3260): No heartbeat from core client for 30 sec - exiting
09:32:08 (3260): No heartbeat from core client for 30 sec - exiting
09:32:09 (3260): No heartbeat from core client for 30 sec - exiting
09:32:10 (3260): No heartbeat from core client for 30 sec - exiting
09:32:11 (3260): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2908, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2916, selfPID=2564, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
12:08:15 (3832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:08:18 (3832): No heartbeat from core client for 30 sec - exiting
12:08:19 (3832): No heartbeat from core client for 30 sec - exiting
12:08:20 (3832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=12012, selfPID=12012, iMonCtr=2
11:29:50 (3820): No heartbeat from core client for 30 sec - exiting
11:29:51 (3820): No heartbeat from core client for 30 sec - exiting
11:29:52 (3820): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4812, selfPID=2616, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4224, selfPID=3512, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3656, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5780, selfPID=6088, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
14:44:17 (4744): No heartbeat from core client for 30 sec - exiting
14:44:18 (4744): No heartbeat from core client for 30 sec - exiting
14:44:19 (4744): No heartbeat from core client for 30 sec - exiting
14:44:20 (4744): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2676, selfPID=2676, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2992, selfPID=2992, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5480, selfPID=4484, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_anz_n0og_2012_1_008576309/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_anz_n0og_2012_1_008576309/dataout/region_restart.day after 11 attempts

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakg.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_anz_n0og_2012_1_008576309_2_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n0og_2012_1_008576309_2_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n0og_2012_1_008576309_2_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n0og_2012_1_008576309_2_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n0og_2012_1_008576309_2_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n0og_2012_1_008576309_2_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_anz_n0og_2012_1_008576309_2_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
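The stderr output above repeats the same heartbeat message once per second, which makes the sequence of events hard to scan. A minimal sketch of one way to condense such a log, assuming each repeated line differs only in its leading `HH:MM:SS (pid): ` prefix; the function name `summarize_stderr` and the sample list are illustrative and not part of BOINC or the CPDN tooling:

```python
import re
from collections import Counter

# Matches a leading "HH:MM:SS (pid): " prefix so repeated messages compare equal.
TIMESTAMP_PREFIX = re.compile(r"^\d{2}:\d{2}:\d{2} \(\d+\): ")

def summarize_stderr(lines):
    """Count each distinct message, ignoring timestamp/PID prefixes and blank lines."""
    counts = Counter()
    for line in lines:
        message = TIMESTAMP_PREFIX.sub("", line.strip())
        if message:
            counts[message] += 1
    return counts

# A few lines copied from the log above, used here only as sample input.
sample = [
    "Model crash detected, will try to restart...",
    "Leaving CPDN_Main::Monitor...",
    "11:11:14 (5948): No heartbeat from core client for 30 sec - exiting",
    "11:11:15 (5948): No heartbeat from core client for 30 sec - exiting",
    "CPDN Monitor - No 'heartbeat' from BOINC...",
]

for message, count in summarize_stderr(sample).most_common():
    print(f"{count:4d}  {message}")
```

Run against the full stderr text, this would give a per-message tally (heartbeat losses, quit requests, model restarts) without changing the log itself.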
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
27 Apr 2014 15:28:03  1311870  16441731   hadam3p_anz_n0og_2012_1_008576309_2  57,899    248,726         4.2959
21 Apr 2014 13:27:07  1311870  16441731   hadam3p_anz_n0og_2012_1_008576309_2  46,379    199,441         4.3002
19 Apr 2014 19:44:43  1311870  16441731   hadam3p_anz_n0og_2012_1_008576309_2  34,859    150,814         4.3264
17 Apr 2014 15:23:54  1311870  16441731   hadam3p_anz_n0og_2012_1_008576309_2  23,339    101,109         4.3322
03 Apr 2014 19:12:01  1311870  16441731   hadam3p_anz_n0og_2012_1_008576309_2  11,819    51,982          4.3982
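
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. That definition is an assumption inferred from the numbers, not something the page states. A short sketch that recomputes the column from the rows above; the names `trickles` and `average_sec_per_timestep` are illustrative:

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
trickles = [
    (57899, 248726, 4.2959),
    (46379, 199441, 4.3002),
    (34859, 150814, 4.3264),
    (23339, 101109, 4.3322),
    (11819, 51982, 4.3982),
]

def average_sec_per_timestep(timestep, cpu_time_sec):
    """Assumed definition: cumulative CPU seconds divided by timesteps completed."""
    return cpu_time_sec / timestep

for timestep, cpu_time_sec, reported in trickles:
    computed = average_sec_per_timestep(timestep, cpu_time_sec)
    # Print both values side by side so any mismatch is easy to spot.
    print(f"timestep={timestep:6d}  computed={computed:.4f}  reported={reported:.4f}")
```

Under this assumption each recomputed value agrees with the reported one to within rounding at four decimal places.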

