Task 12199186

Name hadam3p_saf_189z_1982_1_006919359_0
Workunit 7122675
Created 22 Nov 2010, 9:45:45 UTC
Sent 18 Mar 2011, 23:18:42 UTC
Report deadline 29 Feb 2012, 4:38:42 UTC
Received 12 Apr 2011, 22:24:33 UTC
Server state Over
Outcome No reply
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1139664
Run time 4 days 3 hours 15 min 13 sec
CPU time 2 days 20 hours 8 min 18 sec
Validate state Invalid
Credit 1,309.70
Device peak FLOPS 2.10 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Southern Africa v6.08 windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:33:46 (8780): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:33:47 (8780): No heartbeat from core client for 30 sec - exiting
16:33:48 (8780): No heartbeat from core client for 30 sec - exiting
16:33:49 (8780): No heartbeat from core client for 30 sec - exiting
16:33:50 (8780): No heartbeat from core client for 30 sec - exiting
16:33:51 (8780): No heartbeat from core client for 30 sec - exiting
16:33:52 (8780): No heartbeat from core client for 30 sec - exiting
16:33:53 (8780): No heartbeat from core client for 30 sec - exiting
16:33:54 (8780): No heartbeat from core client for 30 sec - exiting
16:33:55 (8780): No heartbeat from core client for 30 sec - exiting
16:33:56 (8780): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
02:33:53 (11784): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=23636, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10876, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3028, selfPID=4416, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10328, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4792, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
00:09:04 (9712): No heartbeat from core client for 30 sec - exiting
00:09:06 (9712): No heartbeat from core client for 30 sec - exiting
00:09:07 (9712): No heartbeat from core client for 30 sec - exiting
00:09:08 (9712): No heartbeat from core client for 30 sec - exiting
00:09:09 (9712): No heartbeat from core client for 30 sec - exiting
00:09:10 (9712): No heartbeat from core client for 30 sec - exiting
00:09:11 (9712): No heartbeat from core client for 30 sec - exiting
00:09:12 (9712): No heartbeat from core client for 30 sec - exiting
00:09:13 (9712): No heartbeat from core client for 30 sec - exiting
00:09:14 (9712): No heartbeat from core client for 30 sec - exiting
00:09:15 (9712): No heartbeat from core client for 30 sec - exiting
00:09:16 (9712): No heartbeat from core client for 30 sec - exiting
00:09:17 (9712): No heartbeat from core client for 30 sec - exiting
00:09:18 (9712): No heartbeat from core client for 30 sec - exiting
00:09:19 (9712): No heartbeat from core client for 30 sec - exiting
00:09:20 (9712): No heartbeat from core client for 30 sec - exiting
00:09:21 (9712): No heartbeat from core client for 30 sec - exiting
00:09:22 (9712): No heartbeat from core client for 30 sec - exiting
00:09:23 (9712): No heartbeat from core client for 30 sec - exiting
00:09:24 (9712): No heartbeat from core client for 30 sec - exiting
00:09:25 (9712): No heartbeat from core client for 30 sec - exiting
00:09:26 (9712): No heartbeat from core client for 30 sec - exiting
00:09:27 (9712): No heartbeat from core client for 30 sec - exiting
00:09:28 (9712): No heartbeat from core client for 30 sec - exiting
00:09:29 (9712): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5908, selfPID=5296, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:22:08 (6464): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7260, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6216, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
12:35:10 (8168): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
12:35:12 (8168): No heartbeat from core client for 30 sec - exiting
12:35:13 (8168): No heartbeat from core client for 30 sec - exiting
12:35:14 (8168): No heartbeat from core client for 30 sec - exiting
12:35:15 (8168): No heartbeat from core client for 30 sec - exiting
12:35:16 (8168): No heartbeat from core client for 30 sec - exiting
12:35:17 (8168): No heartbeat from core client for 30 sec - exiting
12:35:18 (8168): No heartbeat from core client for 30 sec - exiting
12:35:19 (8168): No heartbeat from core client for 30 sec - exiting
12:35:20 (8168): No heartbeat from core client for 30 sec - exiting
12:35:21 (8168): No heartbeat from core client for 30 sec - exiting
12:35:22 (8168): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:55:32 (864): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
22:55:45 (864): No heartbeat from core client for 30 sec - exiting
22:55:48 (864): No heartbeat from core client for 30 sec - exiting
22:55:49 (864): No heartbeat from core client for 30 sec - exiting
22:55:50 (864): No heartbeat from core client for 30 sec - exiting
22:55:51 (864): No heartbeat from core client for 30 sec - exiting
22:55:52 (864): No heartbeat from core client for 30 sec - exiting
22:55:53 (864): No heartbeat from core client for 30 sec - exiting
22:55:54 (864): No heartbeat from core client for 30 sec - exiting
22:55:55 (864): No heartbeat from core client for 30 sec - exiting
22:55:57 (864): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=16064, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
19:22:59 (16064): called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_saf_189z_1982_1_006919359_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_189z_1982_1_006919359_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_189z_1982_1_006919359_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_189z_1982_1_006919359_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_189z_1982_1_006919359_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
04 Apr 2011 06:38:06  1139664  12199186   hadam3p_saf_189z_1982_1_006919359_0  80,736    218,796         2.7100
03 Apr 2011 12:35:31  1139664  12199186   hadam3p_saf_189z_1982_1_006919359_0  69,216    187,501         2.7089
25 Mar 2011 23:58:16  1139664  12199186   hadam3p_saf_189z_1982_1_006919359_0  57,696    157,578         2.7312
23 Mar 2011 12:27:09  1139664  12199186   hadam3p_saf_189z_1982_1_006919359_0  46,176    124,435         2.6948
22 Mar 2011 19:47:58  1139664  12199186   hadam3p_saf_189z_1982_1_006919359_0  34,656    93,344          2.6934
21 Mar 2011 21:55:22  1139664  12199186   hadam3p_saf_189z_1982_1_006919359_0  23,136    62,321          2.6937
20 Mar 2011 20:50:30  1139664  12199186   hadam3p_saf_189z_1982_1_006919359_0  11,616    31,424          2.7052
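
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That reading is inferred from the figures in the table, not stated on the page. A minimal Python sketch that recomputes the column from the other two; the names `TRICKLES` and `seconds_per_timestep` are illustrative and not part of BOINC or CPDN:

```python
# Rows copied from the "Latest Trickles Received" table:
# (timestep, CPU time in seconds, reported average in sec/TS).
TRICKLES = [
    (80736, 218796, 2.7100),
    (69216, 187501, 2.7089),
    (57696, 157578, 2.7312),
    (46176, 124435, 2.6948),
    (34656, 93344, 2.6934),
    (23136, 62321, 2.6937),
    (11616, 31424, 2.7052),
]


def seconds_per_timestep(cpu_seconds, timestep):
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in TRICKLES:
    computed = seconds_per_timestep(cpu_seconds, timestep)
    # Show the recomputed value next to the value reported on the page.
    print(f"{timestep:>6}  {computed:.4f}  (reported {reported:.4f})")
```

Under this assumption, each recomputed value agrees with the reported one to four decimal places.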