Task 16672030

Name hadam3p_eu_f9x4_2013_1_008764652_0
Workunit 8910630
Created 17 Jun 2014, 15:27:00 UTC
Sent 17 Jun 2014, 19:25:58 UTC
Report deadline 31 May 2015, 0:45:58 UTC
Received 14 Aug 2014, 14:40:53 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1317098
Run time 6 days 14 hours 15 min 58 sec
CPU time 5 days 1 hour 35 min 25 sec
Validate state Invalid
Credit 1,194.02
Device peak FLOPS 1.71 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 (windows_intelx86)
Stderr
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
15:02:56 (5660): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3680, selfPID=3680, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4592, selfPID=2924, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:46:54 (3224): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
19:41:29 (4700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5952, selfPID=6748, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4884, selfPID=1828, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:31:11 (3080): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:10:31 (4776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:34:58 (4140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:34:59 (4140): No heartbeat from core client for 30 sec - exiting
18:35:00 (4140): No heartbeat from core client for 30 sec - exiting
18:35:01 (4140): No heartbeat from core client for 30 sec - exiting
18:35:02 (4140): No heartbeat from core client for 30 sec - exiting
18:35:03 (4140): No heartbeat from core client for 30 sec - exiting
18:35:04 (4140): No heartbeat from core client for 30 sec - exiting
18:35:05 (4140): No heartbeat from core client for 30 sec - exiting
18:35:06 (4140): No heartbeat from core client for 30 sec - exiting
18:35:07 (4140): No heartbeat from core client for 30 sec - exiting
20:20:15 (5376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4280, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3876, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
18:02:27 (1616): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:02:28 (1616): No heartbeat from core client for 30 sec - exiting
18:02:29 (1616): No heartbeat from core client for 30 sec - exiting
18:02:30 (1616): No heartbeat from core client for 30 sec - exiting
18:02:31 (1616): No heartbeat from core client for 30 sec - exiting
18:02:32 (1616): No heartbeat from core client for 30 sec - exiting
18:02:33 (1616): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
00:16:01 (3908): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:16:02 (3908): No heartbeat from core client for 30 sec - exiting
00:16:03 (3908): No heartbeat from core client for 30 sec - exiting
00:16:05 (3908): No heartbeat from core client for 30 sec - exiting
00:16:06 (3908): No heartbeat from core client for 30 sec - exiting
00:16:07 (3908): No heartbeat from core client for 30 sec - exiting
00:16:08 (3908): No heartbeat from core client for 30 sec - exiting
00:16:09 (3908): No heartbeat from core client for 30 sec - exiting
00:16:10 (3908): No heartbeat from core client for 30 sec - exiting
00:16:11 (3908): No heartbeat from core client for 30 sec - exiting
00:16:12 (3908): No heartbeat from core client for 30 sec - exiting
00:16:13 (3908): No heartbeat from core client for 30 sec - exiting
00:16:14 (3908): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3792, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4212, selfPID=3084, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4832, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4880, selfPID=3472, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4716, selfPID=3216, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1252, selfPID=4304, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1520, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3412, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
02:57:50 (6552): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
04:35:52 (5000): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
04:35:53 (5000): No heartbeat from core client for 30 sec - exiting
04:35:55 (5000): No heartbeat from core client for 30 sec - exiting
04:35:56 (5000): No heartbeat from core client for 30 sec - exiting
04:35:57 (5000): No heartbeat from core client for 30 sec - exiting
05:06:18 (3400): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:43:03 (1320): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7496, selfPID=7496, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4240, selfPID=3008, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=940, selfPID=3092, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
21:50:30 (6148): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:08:52 (6768): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:14:55 (5772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
08:00:52 (5992): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4676, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4684, selfPID=436, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
22:05:50 (1148): No heartbeat from core client for 30 sec - exiting
22:05:51 (1148): No heartbeat from core client for 30 sec - exiting
22:05:52 (1148): No heartbeat from core client for 30 sec - exiting
22:05:53 (1148): No heartbeat from core client for 30 sec - exiting
22:05:54 (1148): No heartbeat from core client for 30 sec - exiting
22:05:55 (1148): No heartbeat from core client for 30 sec - exiting
22:05:56 (1148): No heartbeat from core client for 30 sec - exiting
22:05:58 (1148): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4976, selfPID=2968, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4596, selfPID=4832, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4572, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5724, selfPID=3604, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3328, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2232, selfPID=2820, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3764, selfPID=1904, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...

SETPOS: Seek Failed: Invalid argument
SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1

Model crashed: SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_f9x4_2013_1_008764652_0_7.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_f9x4_2013_1_008764652_0_8.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_f9x4_2013_1_008764652_0_9.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_f9x4_2013_1_008764652_0_10.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_f9x4_2013_1_008764652_0_11.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_f9x4_2013_1_008764652_0_12.zip</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>

</message>
]]>
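The stderr log above repeats a small number of message types many times, so a tally can make it easier to read. The following is a minimal sketch, assuming the log has been saved as a list of text lines; the category labels (`no_heartbeat`, `suspend_request`, `crash_restart`, `fatal_model_crash`) are hypothetical names chosen for this illustration and are not BOINC or CPDN terminology. The sample lines are copied from the log above.

```python
import re
from collections import Counter

# Hypothetical labels mapped to substrings that appear verbatim in the log above.
PATTERNS = {
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "crash_restart": re.compile(r"Model crash detected"),
    "fatal_model_crash": re.compile(r"^Model crashed:"),
}


def tally(lines):
    """Count how many lines fall into each category (first match wins)."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break
    return counts


# A few lines copied from the stderr text, used only to exercise the function.
sample = [
    "15:02:56 (5660): No heartbeat from core client for 30 sec - exiting",
    "CPDN Monitor - No 'heartbeat' from BOINC...",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "Model crash detected, will try to restart...",
    "Model crashed: SETPOS: Unit 61 to Word Address -198 Failed with Error Code -1",
]

for label, count in tally(sample).items():
    print(label, count)
```

Lines that match none of the patterns, such as the `CPDN Monitor - No 'heartbeat' from BOINC...` notice, are left uncounted rather than forced into a category.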
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
08 Aug 2014 09:12:52 | 1317098 | 16672030 | hadam3p_eu_f9x4_2013_1_008764652_0 | 69,216 | 397,195 | 5.7385
25 Jul 2014 14:46:10 | 1317098 | 16672030 | hadam3p_eu_f9x4_2013_1_008764652_0 | 57,724 | 336,119 | 5.8229
25 Jul 2014 14:46:10 | 1317098 | 16672030 | hadam3p_eu_f9x4_2013_1_008764652_0 | 57,696 | 335,409 | 5.8134
16 Jul 2014 02:39:37 | 1317098 | 16672030 | hadam3p_eu_f9x4_2013_1_008764652_0 | 46,176 | 272,574 | 5.9029
02 Jul 2014 21:53:58 | 1317098 | 16672030 | hadam3p_eu_f9x4_2013_1_008764652_0 | 34,656 | 193,957 | 5.5966
25 Jun 2014 20:40:36 | 1317098 | 16672030 | hadam3p_eu_f9x4_2013_1_008764652_0 | 23,136 | 124,338 | 5.3742
23 Jun 2014 14:13:28 | 1317098 | 16672030 | hadam3p_eu_f9x4_2013_1_008764652_0 | 11,616 | 62,657 | 5.3940
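
The Average (sec/TS) column in the trickle rows above appears to be the cumulative CPU time divided by the timestep count. That relationship is inferred from the numbers rather than documented on this page, so the sketch below treats it as an assumption and simply checks it against the listed rows. The function name `sec_per_timestep` is hypothetical.

```python
# Trickle rows from the table above: (timestep, cpu_time_sec, reported_average)
TRICKLES = [
    (69216, 397195, 5.7385),
    (57724, 336119, 5.8229),
    (57696, 335409, 5.8134),
    (46176, 272574, 5.9029),
    (34656, 193957, 5.5966),
    (23136, 124338, 5.3742),
    (11616, 62657, 5.3940),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition of the Average column: CPU seconds per timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # The page shows four decimals, so agreement within half a unit
    # in the fourth decimal place counts as a match.
    matches = abs(computed - reported) <= 5e-5
    print(timestep, round(computed, 4), reported, matches)
```

Under that assumption, every listed row agrees with the reported average to the four decimal places shown.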

