Task 16787553

Name: hadam3p_eu_k5ws_2013_1_008867645_0
Workunit: 9013574
Created: 9 Jul 2014, 14:20:59 UTC
Sent: 14 Jul 2014, 15:09:26 UTC
Report deadline: 26 Jun 2015, 20:29:26 UTC
Received: 1 Aug 2014, 9:01:24 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 0 (0x00000000)
Computer ID: 1333820
Run time: 4 days 13 hours 2 min 3 sec
CPU time: 3 days 11 hours 8 min 35 sec
Validate state: Invalid
Credit: 2,187.67
Device peak FLOPS: 2.43 GFLOPS
Application version: UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=788, selfPID=788, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=71636, selfPID=70440, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11608, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=11748, selfPID=5852, iMonCtr=1
Model crash detected, will try to restart...
13:52:18 (5392): No heartbeat from core client for 30 sec - exiting
13:52:19 (5392): No heartbeat from core client for 30 sec - exiting
13:52:20 (5392): No heartbeat from core client for 30 sec - exiting
13:52:21 (5392): No heartbeat from core client for 30 sec - exiting
13:52:22 (5392): No heartbeat from core client for 30 sec - exiting
13:52:23 (5392): No heartbeat from core client for 30 sec - exiting
13:52:24 (5392): No heartbeat from core client for 30 sec - exiting
13:52:25 (5392): No heartbeat from core client for 30 sec - exiting
13:52:26 (5392): No heartbeat from core client for 30 sec - exiting
13:52:27 (5392): No heartbeat from core client for 30 sec - exiting
13:52:28 (5392): No heartbeat from core client for 30 sec - exiting
13:52:29 (5392): No heartbeat from core client for 30 sec - exiting
13:52:30 (5392): No heartbeat from core client for 30 sec - exiting
13:52:31 (5392): No heartbeat from core client for 30 sec - exiting
13:52:32 (5392): No heartbeat from core client for 30 sec - exiting
13:52:33 (5392): No heartbeat from core client for 30 sec - exiting
13:52:34 (5392): No heartbeat from core client for 30 sec - exiting
13:52:35 (5392): No heartbeat from core client for 30 sec - exiting
13:52:36 (5392): No heartbeat from core client for 30 sec - exiting
13:52:37 (5392): No heartbeat from core client for 30 sec - exiting
13:52:38 (5392): No heartbeat from core client for 30 sec - exiting
13:52:39 (5392): No heartbeat from core client for 30 sec - exiting
13:52:40 (5392): No heartbeat from core client for 30 sec - exiting
13:52:41 (5392): No heartbeat from core client for 30 sec - exiting
13:52:42 (5392): No heartbeat from core client for 30 sec - exiting
13:52:43 (5392): No heartbeat from core client for 30 sec - exiting
13:52:44 (5392): No heartbeat from core client for 30 sec - exiting
13:52:45 (5392): No heartbeat from core client for 30 sec - exiting
13:52:46 (5392): No heartbeat from core client for 30 sec - exiting
13:52:47 (5392): No heartbeat from core client for 30 sec - exiting
13:52:48 (5392): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5660, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8000, selfPID=3216, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7888, selfPID=5344, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7632, selfPID=4492, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8112, selfPID=5236, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2212, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3680, selfPID=5468, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7688, selfPID=5656, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7856, selfPID=6228, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6912, selfPID=3488, iMonCtr=1
Model crash detected, will try to restart...
06:14:50 (7756): No heartbeat from core client for 30 sec - exiting
06:14:51 (7756): No heartbeat from core client for 30 sec - exiting
06:14:52 (7756): No heartbeat from core client for 30 sec - exiting
06:14:53 (7756): No heartbeat from core client for 30 sec - exiting
06:14:54 (7756): No heartbeat from core client for 30 sec - exiting
06:14:55 (7756): No heartbeat from core client for 30 sec - exiting
06:14:56 (7756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:14:57 (7756): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7256, selfPID=5180, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4628, selfPID=4568, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5764, selfPID=2940, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7448, selfPID=2288, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4976, selfPID=4232, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6232, selfPID=6232, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7788, selfPID=7788, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3464, selfPID=3464, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7016, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7040, selfPID=3296, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7036, iMonCtr=2
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9000, selfPID=9000, iMonCtr=2
11:57:40 (5892): No heartbeat from core client for 30 sec - exiting
11:57:41 (5892): No heartbeat from core client for 30 sec - exiting
11:57:42 (5892): No heartbeat from core client for 30 sec - exiting
11:57:43 (5892): No heartbeat from core client for 30 sec - exiting
11:57:44 (5892): No heartbeat from core client for 30 sec - exiting
11:57:45 (5892): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8028, selfPID=5488, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5652, selfPID=5652, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8176, selfPID=8108, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4176, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3296, selfPID=4996, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_k5ws_2013_1_008867645/dataout/atmos_restart.day after 11 attempts
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_k5ws_2013_1_008867645\tmp\xaakg.namelists

Image              PC        Routine            Line        Source             
hadrm3p_eu_um_6.0  016AC52A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01654460  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  0165362A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01632469  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  015366EB  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  015D2AE2  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  015D35AF  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01379860  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01690893  Unknown               Unknown  Unknown
kernel32.dll       761E338A  Unknown               Unknown  Unknown
ntdll.dll          772F9F72  Unknown               Unknown  Unknown
ntdll.dll          772F9F45  Unknown               Unknown  Unknown
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_k5ws_2013_1_008867645\tmp\xaakm.namelists

Image              PC        Routine            Line        Source             
hadam3p_eu_um_6.0  00BFA39A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00BA2CD0  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00BA1E9A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00B82819  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00A82287  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00B1E7B2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00B1F2DA  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00899BD2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00BDE638  Unknown               Unknown  Unknown
kernel32.dll       761E338A  Unknown               Unknown  Unknown
ntdll.dll          772F9F72  Unknown               Unknown  Unknown
ntdll.dll          772F9F45  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2664, selfPID=3148, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_k5ws_2013_1_008867645_0_12.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
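
The stderr above is long and repetitive, so a tally of message types can make it easier to read. The following is a minimal sketch, assuming the log has been saved as plain-text lines. The category names and the `tally_events` helper are hypothetical and not part of BOINC or CPDN; only the matched message text is taken from the log itself.

```python
import re
from collections import Counter

# Message fragments copied from the stderr above; the category names
# on the left are invented for this sketch.
PATTERNS = {
    "quit_request": re.compile(r"Quit request from BOINC"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "restart_attempt": re.compile(r"Model crash detected, will try to restart"),
    "fortran_eof": re.compile(r"forrtl: severe \(24\)"),
}

def tally_events(lines):
    """Count how many lines match each known message type."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # one category per line is enough here
    return counts

# A few lines copied from the log above, used as sample input.
sample = [
    "CPDN Monitor - Quit request from BOINC...",
    "Model crash detected, will try to restart...",
    "13:52:18 (5392): No heartbeat from core client for 30 sec - exiting",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "Called boinc_finish",
]
print(tally_events(sample))
```

Lines that match none of the patterns, such as `Called boinc_finish`, are simply left uncounted.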
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
31 Jul 2014 13:15:25  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  126,816   295,042         2.3265
29 Jul 2014 17:31:20  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  115,317   268,052         2.3245
29 Jul 2014 11:10:05  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  115,296   267,656         2.3215
24 Jul 2014 19:24:33  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  103,776   240,484         2.3173
22 Jul 2014 16:16:25  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  92,274    213,298         2.3116
22 Jul 2014 16:16:25  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  92,256    212,926         2.3080
21 Jul 2014 08:48:46  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  80,736    186,335         2.3080
19 Jul 2014 21:21:29  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  69,216    159,300         2.3015
19 Jul 2014 11:01:52  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  57,797    131,765         2.2798
19 Jul 2014 08:16:05  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  57,696    131,204         2.2741
18 Jul 2014 13:46:28  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  46,176    104,107         2.2546
17 Jul 2014 20:13:17  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  34,656    78,542          2.2663
15 Jul 2014 18:45:53  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  23,136    52,610          2.2739
15 Jul 2014 04:41:17  1333820  16787553   hadam3p_eu_k5ws_2013_1_008867645_0  11,616    26,835          2.3102
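
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep reached, rounded to four decimals. The sketch below checks that reading against three rows copied from the table. The helper name `seconds_per_timestep` is made up for illustration, and the rounding rule is an assumption inferred from the listed values, not something the page states.

```python
def seconds_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by timesteps completed so far."""
    return round(cpu_time_sec / timestep, 4)

# (timestep, CPU time in seconds, average as listed in the table)
rows = [
    (126816, 295042, 2.3265),
    (34656, 78542, 2.2663),
    (11616, 26835, 2.3102),
]

for timestep, cpu_time, listed in rows:
    computed = seconds_per_timestep(cpu_time, timestep)
    print(timestep, computed, listed)
```

Because the figure is cumulative, it changes slowly from one trickle to the next.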