Task 15179400

Name hadam3p_eu_wmms_1963_1_006863084_1
Workunit 7066400
Created 23 Aug 2012, 15:01:59 UTC
Sent 30 Aug 2012, 11:47:31 UTC
Report deadline 12 Aug 2013, 17:07:31 UTC
Received 6 Sep 2012, 3:49:14 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1231846
Run time 1 day 18 hours 38 min 11 sec
CPU time 1 day 12 hours 47 min 6 sec
Validate state Invalid
Credit 796.57
Device peak FLOPS 2.94 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
21:41:49 (5268): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3740, selfPID=3740, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1456, selfPID=1040, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4696, selfPID=4752, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4540, selfPID=3968, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5616, selfPID=2688, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4588, selfPID=3776, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
19:44:34 (5036): No heartbeat from core client for 30 sec - exiting
19:44:36 (5036): No heartbeat from core client for 30 sec - exiting
19:44:37 (5036): No heartbeat from core client for 30 sec - exiting
19:44:38 (5036): No heartbeat from core client for 30 sec - exiting
19:44:39 (5036): No heartbeat from core client for 30 sec - exiting
19:44:40 (5036): No heartbeat from core client for 30 sec - exiting
19:44:41 (5036): No heartbeat from core client for 30 sec - exiting
19:44:42 (5036): No heartbeat from core client for 30 sec - exiting
19:44:43 (5036): No heartbeat from core client for 30 sec - exiting
19:44:44 (5036): No heartbeat from core client for 30 sec - exiting
19:44:45 (5036): No heartbeat from core client for 30 sec - exiting
19:44:46 (5036): No heartbeat from core client for 30 sec - exiting
19:44:47 (5036): No heartbeat from core client for 30 sec - exiting
19:44:48 (5036): No heartbeat from core client for 30 sec - exiting
19:44:49 (5036): No heartbeat from core client for 30 sec - exiting
19:44:50 (5036): No heartbeat from core client for 30 sec - exiting
19:44:51 (5036): No heartbeat from core client for 30 sec - exiting
19:44:52 (5036): No heartbeat from core client for 30 sec - exiting
19:44:53 (5036): No heartbeat from core client for 30 sec - exiting
19:44:54 (5036): No heartbeat from core client for 30 sec - exiting
19:44:55 (5036): No heartbeat from core client for 30 sec - exiting
19:44:56 (5036): No heartbeat from core client for 30 sec - exiting
19:44:57 (5036): No heartbeat from core client for 30 sec - exiting
19:44:58 (5036): No heartbeat from core client for 30 sec - exiting
19:44:59 (5036): No heartbeat from core client for 30 sec - exiting
19:45:00 (5036): No heartbeat from core client for 30 sec - exiting
19:45:01 (5036): No heartbeat from core client for 30 sec - exiting
19:45:02 (5036): No heartbeat from core client for 30 sec - exiting
19:45:03 (5036): No heartbeat from core client for 30 sec - exiting
19:45:04 (5036): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5372, selfPID=4104, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5748, selfPID=3888, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3680, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=672, selfPID=5016, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6036, selfPID=2548, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_wmms_1963_1_006863084/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_wmms_1963_1_006863084/dataout/region_restart.day after 11 attempts
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_wmms_1963_1_006863084\tmp\xaakg.namelists

Image              PC        Routine            Line        Source             
hadrm3p_eu_um_6.0  010BC52A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01064460  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  0106362A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01042469  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  00F466EB  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  00FE2AE2  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  00FE35AF  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  00D89860  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  010A0893  Unknown               Unknown  Unknown
kernel32.dll       7542339A  Unknown               Unknown  Unknown
ntdll.dll          77209EF2  Unknown               Unknown  Unknown
ntdll.dll          77209EC5  Unknown               Unknown  Unknown
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_wmms_1963_1_006863084\tmp\xaakm.namelists

Image              PC        Routine            Line        Source             
hadam3p_eu_um_6.0  00FEA39A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00F92CD0  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00F91E9A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00F72819  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00E72287  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00F0E7B2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00F0F2DA  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00C89BD2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00FCE638  Unknown               Unknown  Unknown
kernel32.dll       7542339A  Unknown               Unknown  Unknown
ntdll.dll          77209EF2  Unknown               Unknown  Unknown
ntdll.dll          77209EC5  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4396, selfPID=4628, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_wmms_1963_1_006863084_1_5.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wmms_1963_1_006863084_1_6.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wmms_1963_1_006863084_1_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wmms_1963_1_006863084_1_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wmms_1963_1_006863084_1_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wmms_1963_1_006863084_1_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wmms_1963_1_006863084_1_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wmms_1963_1_006863084_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
04 Sep 2012 13:52:17  1231846  15179400   hadam3p_eu_wmms_1963_1_006863084_1  46,176    110,256         2.3877
03 Sep 2012 04:05:10  1231846  15179400   hadam3p_eu_wmms_1963_1_006863084_1  34,656    83,756          2.4168
02 Sep 2012 01:04:01  1231846  15179400   hadam3p_eu_wmms_1963_1_006863084_1  23,136    55,669          2.4062
01 Sep 2012 05:49:50  1231846  15179400   hadam3p_eu_wmms_1963_1_006863084_1  11,616    27,775          2.3911
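
The Average (sec/TS) column in the trickle rows above appears to be the cumulative CPU time divided by the timestep count. The following is a minimal Python sketch that checks this against the four rows, assuming that relationship holds; the names `TRICKLES`, `seconds_per_timestep` and `matches_reported` are illustrative and not part of the project's code.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
TRICKLES = [
    (46176, 110256, 2.3877),
    (34656, 83756, 2.4168),
    (23136, 55669, 2.4062),
    (11616, 27775, 2.3911),
]


def seconds_per_timestep(timestep, cpu_time_sec):
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_time_sec / timestep


def matches_reported(rows, tolerance=5e-5):
    """Return one boolean per row: does the recomputed average agree with the
    reported value to four decimal places (within the given tolerance)?"""
    return [
        abs(seconds_per_timestep(ts, cpu) - reported) < tolerance
        for ts, cpu, reported in rows
    ]


if __name__ == "__main__":
    for (ts, cpu, reported), ok in zip(TRICKLES, matches_reported(TRICKLES)):
        print(f"timestep={ts} recomputed={seconds_per_timestep(ts, cpu):.4f} "
              f"reported={reported} agrees={ok}")
```

Under this assumption all four rows agree with the reported averages to four decimal places.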