Task 12641184

Name hadam3p_pnw_32ie_1984_1_007183406_1
Workunit 7381688
Created 27 Feb 2011, 21:43:03 UTC
Sent 27 Feb 2011, 22:41:45 UTC
Report deadline 10 Feb 2012, 4:01:45 UTC
Received 10 Mar 2011, 19:23:27 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1109813
Run time 4 days 15 hours 44 min 53 sec
CPU time 3 days 6 hours 48 min 34 sec
Validate state Invalid
Credit 2,505.24
Device peak FLOPS 2.94 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.08 (windows_intelx86)
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4660, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 3
18:37:41 (4660): called boinc_finish
19:37:54 (5736): No heartbeat from core client for 30 sec - exiting
19:37:55 (5736): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5648, selfPID=4512, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5368, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
09:23:03 (4312): No heartbeat from core client for 30 sec - exiting
09:23:04 (4312): No heartbeat from core client for 30 sec - exiting
09:23:05 (4312): No heartbeat from core client for 30 sec - exiting
09:23:06 (4312): No heartbeat from core client for 30 sec - exiting
09:23:07 (4312): No heartbeat from core client for 30 sec - exiting
09:23:08 (4312): No heartbeat from core client for 30 sec - exiting
09:23:09 (4312): No heartbeat from core client for 30 sec - exiting
09:23:10 (4312): No heartbeat from core client for 30 sec - exiting
09:23:12 (4312): No heartbeat from core client for 30 sec - exiting
09:23:13 (4312): No heartbeat from core client for 30 sec - exiting
09:23:14 (4312): No heartbeat from core client for 30 sec - exiting
09:23:15 (4312): No heartbeat from core client for 30 sec - exiting
09:23:16 (4312): No heartbeat from core client for 30 sec - exiting
09:23:17 (4312): No heartbeat from core client for 30 sec - exiting
09:23:18 (4312): No heartbeat from core client for 30 sec - exiting
09:23:19 (4312): No heartbeat from core client for 30 sec - exiting
09:23:20 (4312): No heartbeat from core client for 30 sec - exiting
09:23:21 (4312): No heartbeat from core client for 30 sec - exiting
09:23:22 (4312): No heartbeat from core client for 30 sec - exiting
09:23:24 (4312): No heartbeat from core client for 30 sec - exiting
09:23:25 (4312): No heartbeat from core client for 30 sec - exiting
09:23:26 (4312): No heartbeat from core client for 30 sec - exiting
09:23:27 (4312): No heartbeat from core client for 30 sec - exiting
09:23:28 (4312): No heartbeat from core client for 30 sec - exiting
09:23:29 (4312): No heartbeat from core client for 30 sec - exiting
09:23:30 (4312): No heartbeat from core client for 30 sec - exiting
09:23:31 (4312): No heartbeat from core client for 30 sec - exiting
09:23:32 (4312): No heartbeat from core client for 30 sec - exiting
09:23:33 (4312): No heartbeat from core client for 30 sec - exiting
09:23:34 (4312): No heartbeat from core client for 30 sec - exiting
09:23:36 (4312): No heartbeat from core client for 30 sec - exiting
09:23:37 (4312): No heartbeat from core client for 30 sec - exiting
09:23:38 (4312): No heartbeat from core client for 30 sec - exiting
09:23:39 (4312): No heartbeat from core client for 30 sec - exiting
09:23:40 (4312): No heartbeat from core client for 30 sec - exiting
09:23:41 (4312): No heartbeat from core client for 30 sec - exiting
09:23:42 (4312): No heartbeat from core client for 30 sec - exiting
09:23:43 (4312): No heartbeat from core client for 30 sec - exiting
09:23:44 (4312): No heartbeat from core client for 30 sec - exiting
09:23:45 (4312): No heartbeat from core client for 30 sec - exiting
09:23:46 (4312): No heartbeat from core client for 30 sec - exiting
09:23:48 (4312): No heartbeat from core client for 30 sec - exiting
09:23:49 (4312): No heartbeat from core client for 30 sec - exiting
09:23:50 (4312): No heartbeat from core client for 30 sec - exiting
09:23:51 (4312): No heartbeat from core client for 30 sec - exiting
09:23:52 (4312): No heartbeat from core client for 30 sec - exiting
09:23:53 (4312): No heartbeat from core client for 30 sec - exiting
09:23:54 (4312): No heartbeat from core client for 30 sec - exiting
09:23:55 (4312): No heartbeat from core client for 30 sec - exiting
09:23:56 (4312): No heartbeat from core client for 30 sec - exiting
09:23:57 (4312): No heartbeat from core client for 30 sec - exiting
09:23:58 (4312): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3976, selfPID=4368, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8000, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7536, selfPID=6036, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6952, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7000, selfPID=5400, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 7
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7360, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5536, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 8
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4736, selfPID=5436, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 9
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7124, selfPID=5724, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7588, selfPID=4484, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3756, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4016, selfPID=5272, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 10
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_pnw_32ie_1984_1_007183406/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_pnw_32ie_1984_1_007183406/dataout/region_restart.day after 11 attempts

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakg.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 0
19:20:58 (4476): called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_pnw_32ie_1984_1_007183406_1_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_32ie_1984_1_007183406_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
10 Mar 2011 12:34:27 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 115,296 | 268,460 | 2.3284
09 Mar 2011 14:07:12 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 103,776 | 241,612 | 2.3282
08 Mar 2011 18:00:23 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 92,256 | 215,164 | 2.3322
08 Mar 2011 11:59:38 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 80,736 | 188,077 | 2.3295
08 Mar 2011 11:59:38 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 69,216 | 161,711 | 2.3363
08 Mar 2011 11:59:38 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 57,696 | 134,383 | 2.3292
08 Mar 2011 11:59:38 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 46,176 | 107,928 | 2.3373
08 Mar 2011 11:59:38 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 34,656 | 82,530 | 2.3814
08 Mar 2011 11:59:38 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 23,136 | 55,240 | 2.3876
08 Mar 2011 11:59:38 | 1109813 | 12641184 | hadam3p_pnw_32ie_1984_1_007183406_1 | 11,616 | 27,934 | 2.4048
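
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the cumulative timestep count. The minimal sketch below checks that reading against two of the listed rows. The helper name seconds_per_timestep is hypothetical and not part of BOINC or CPDN, and the formula is an assumption inferred from the numbers, not documented server behaviour.

```python
# Hypothetical helper: assumes Average (sec/TS) = CPU time / timestep.
def seconds_per_timestep(cpu_time_sec, timestep):
    """Return cumulative CPU seconds per model timestep."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return cpu_time_sec / timestep


# (timestep, CPU time in seconds) copied from the latest and earliest trickles.
trickles = [
    (115296, 268460),
    (11616, 27934),
]

for timestep, cpu_time_sec in trickles:
    average = seconds_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>7} timesteps: {average:.4f} sec/TS")
```

Under this assumption, both rows reproduce the listed averages to four decimal places.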

