Task 16166828

Name hadcm3n_oe4h_1900_40_008473428_2
Workunit 8624267
Created 29 Dec 2013, 11:13:06 UTC
Sent 29 Dec 2013, 11:13:13 UTC
Report deadline 30 Mar 2014, 18:40:24 UTC
Received 31 Jan 2014, 15:19:41 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 11900 (0x00002E7C) Unknown error code
Computer ID 459222
Run time 8 days 15 hours 32 min 46 sec
CPU time 8 days 13 hours 25 min 44 sec
Validate state Invalid
Credit 10,886.40
Device peak FLOPS 3.22 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 11900 (0x2e7c)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
19:04:55 (12096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:04:56 (12096): No heartbeat from core client for 30 sec - exiting
19:04:57 (12096): No heartbeat from core client for 30 sec - exiting
19:04:58 (12096): No heartbeat from core client for 30 sec - exiting
19:04:59 (12096): No heartbeat from core client for 30 sec - exiting
19:05:00 (12096): No heartbeat from core client for 30 sec - exiting
19:05:01 (12096): No heartbeat from core client for 30 sec - exiting
19:05:02 (12096): No heartbeat from core client for 30 sec - exiting
19:05:03 (12096): No heartbeat from core client for 30 sec - exiting
19:05:04 (12096): No heartbeat from core client for 30 sec - exiting
19:05:05 (12096): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9276, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8240, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8240, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8240, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8964, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8704, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8332, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10348, iMonCtr=1
Model crash detected, will try to restart...
16:11:37 (12140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:00:39 (10364): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:03:06 (10272): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10700, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8424, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8680, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9680, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11284, iMonCtr=1
Model crash detected, will try to restart...
16:44:19 (5756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7428, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7428, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7428, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8444, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9700, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9900, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
16:03:27 (8268): No heartbeat from core client for 30 sec - exiting
16:03:28 (8268): No heartbeat from core client for 30 sec - exiting
16:03:29 (8268): No heartbeat from core client for 30 sec - exiting
16:03:30 (8268): No heartbeat from core client for 30 sec - exiting
16:03:31 (8268): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
30 Jan 2014 17:43:02 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 907,200 730,263 0.8050
30 Jan 2014 15:41:36 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 881,280 710,085 0.8057
26 Jan 2014 19:24:56 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 855,360 689,400 0.8060
26 Jan 2014 13:42:54 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 829,440 669,187 0.8068
25 Jan 2014 21:59:55 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 803,520 649,108 0.8078
25 Jan 2014 16:42:50 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 777,600 629,239 0.8092
25 Jan 2014 10:22:27 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 751,680 607,651 0.8084
24 Jan 2014 17:18:26 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 725,760 586,070 0.8075
22 Jan 2014 16:35:25 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 699,840 565,328 0.8078
20 Jan 2014 20:07:14 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 673,920 544,692 0.8082
19 Jan 2014 16:55:52 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 648,000 524,366 0.8092
19 Jan 2014 11:10:38 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 622,080 504,630 0.8112
18 Jan 2014 17:36:19 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 596,160 484,347 0.8124
18 Jan 2014 12:18:05 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 570,240 464,027 0.8137
17 Jan 2014 16:47:10 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 544,320 442,453 0.8129
15 Jan 2014 17:44:05 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 518,400 421,169 0.8124
14 Jan 2014 15:28:08 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 492,480 400,746 0.8137
12 Jan 2014 14:50:03 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 466,560 379,428 0.8132
11 Jan 2014 20:45:31 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 440,640 359,145 0.8151
11 Jan 2014 14:44:12 459222 16166828 hadcm3n_oe4h_1900_40_008473428_2 414,720 337,714 0.8143
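
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that assumption against three rows copied from the table above; the function name `sec_per_timestep` is hypothetical and not part of BOINC or CPDN.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
rows = [
    (907_200, 730_263, 0.8050),
    (881_280, 710_085, 0.8057),
    (414_720, 337_714, 0.8143),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimal places.

    Assumption: this is how the page derives its Average (sec/TS) column.
    """
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print both values side by side so any mismatch is easy to spot.
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

If the assumption holds, the computed and reported values agree for each row, which suggests the average is cumulative over the whole run rather than measured per trickle interval.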

