Task 15427607

Name hadcm3n_yjv0_1940_40_008239239_4
Workunit 8394363
Created 5 Nov 2012, 20:48:59 UTC
Sent 5 Nov 2012, 20:49:11 UTC
Report deadline 5 Feb 2013, 4:16:22 UTC
Received 23 Jan 2013, 14:33:26 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 1236037
Run time 16 days 5 hours 29 min 8 sec
CPU time 15 days 18 hours 32 min 16 sec
Validate state Invalid
Credit 9,331.20
Device peak FLOPS 2.55 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.42</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6080, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6052, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5952, iMonCtr=1
Model crash detected, will try to restart...
Atmos Hold Restart file rename failed on atmos_restart.hold
Atmos Hold Restart file rename failed on atmos_restart.hold
Atmos Hold Restart file rename failed on atmos_restart.hold
Atmos Hold Restart file rename failed on atmos_restart.hold
Atmos Hold Restart file rename failed on atmos_restart.hold
Ocean Restart file copy failed on yjv0ko.dae8470
Ocean Restart file copy failed on yjv0ko.dae8480
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6124, iMonCtr=1
Model crash detected, will try to restart...
09:43:40 (5780): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1244, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5764, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6072, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5952, iMonCtr=1
Model crash detected, will try to restart...
21:15:12 (5732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4556, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5800, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5688, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5776, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5744, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5708, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
16:01:04 (5984): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:45:29 (6004): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5644, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5644, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5492, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6024, iMonCtr=1
Model crash detected, will try to restart...
Ocean Restart file copy failed on yjv0ko.dag65c0
Ocean Restart file copy failed on yjv0ko.dag65d0
Ocean Restart file copy failed on yjv0ko.dag65e0
Ocean Restart file copy failed on yjv0ko.dag65f0
Ocean Restart file copy failed on yjv0ko.dag65g0
Ocean Restart file copy failed on yjv0ko.dag65h0
Ocean Restart file copy failed on yjv0ko.dag65i0
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5804, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6028, iMonCtr=1
Model crash detected, will try to restart...
10:02:56 (6072): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:15:26 (7108): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6372, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
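The stderr output above repeats a small number of message types. As a rough aid for reading it, the following sketch tallies them. The label names and the function `summarize_stderr` are hypothetical and not part of BOINC or CPDN; the match strings are copied from the log lines above.

```python
import re
from collections import Counter

# Hypothetical labels mapped to substrings copied from the stderr lines above.
PATTERNS = {
    "model_crash_restart": re.compile(r"Model crash detected"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "atmos_rename_failed": re.compile(r"Atmos Hold Restart file rename failed"),
    "ocean_copy_failed": re.compile(r"Ocean Restart file copy failed"),
    "signal_received": re.compile(r"Signal \d+ received"),
}


def summarize_stderr(text):
    """Count how many lines of a stderr dump match each known message type."""
    counts = Counter()
    for line in text.splitlines():
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break  # each line is counted under at most one label
    return counts


# Small sample built from lines that appear in the log above.
sample = "\n".join([
    "Model crash detected, will try to restart...",
    "09:43:40 (5780): No heartbeat from core client for 30 sec - exiting",
    "CPDN Monitor - No 'heartbeat' from BOINC...",
    "Ocean Restart file copy failed on yjv0ko.dae8470",
    "Signal 11 received, exiting...",
])
summary = summarize_stderr(sample)
```

The "CPDN Monitor - No 'heartbeat' from BOINC..." line is deliberately not matched, so each heartbeat loss is counted once rather than twice.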
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
22 Jan 2013 14:35:33 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 777,600 | 1,362,730 | 1.7525
18 Jan 2013 18:35:14 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 751,680 | 1,315,758 | 1.7504
17 Jan 2013 16:45:38 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 725,760 | 1,268,315 | 1.7476
15 Jan 2013 20:27:22 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 699,840 | 1,221,156 | 1.7449
14 Jan 2013 18:47:37 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 673,920 | 1,175,412 | 1.7441
11 Jan 2013 17:16:06 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 648,000 | 1,129,046 | 1.7424
04 Jan 2013 22:36:14 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 622,080 | 1,083,109 | 1.7411
03 Jan 2013 16:46:09 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 596,160 | 1,038,003 | 1.7411
31 Dec 2012 19:51:22 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 570,240 | 992,830 | 1.7411
27 Dec 2012 21:46:35 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 544,320 | 947,934 | 1.7415
21 Dec 2012 17:57:24 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 518,400 | 902,588 | 1.7411
19 Dec 2012 20:21:55 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 492,480 | 857,293 | 1.7408
17 Dec 2012 20:23:28 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 466,560 | 812,009 | 1.7404
13 Dec 2012 22:07:22 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 440,640 | 766,808 | 1.7402
13 Dec 2012 17:46:23 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 414,720 | 721,537 | 1.7398
07 Dec 2012 19:51:57 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 388,800 | 676,207 | 1.7392
04 Dec 2012 20:34:45 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 362,880 | 631,245 | 1.7395
03 Dec 2012 18:35:28 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 336,960 | 585,984 | 1.7390
29 Nov 2012 22:14:23 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 311,040 | 540,644 | 1.7382
28 Nov 2012 18:56:15 | 1236037 | 15427607 | hadcm3n_yjv0_1940_40_008239239_4 | 285,120 | 495,528 | 1.7380
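
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. This is an inference from the figures rather than something the page states. A minimal sketch of that check follows; the helper names `parse_count` and `average_sec_per_timestep` are hypothetical, and the inputs are taken from the first and last rows of the table.

```python
def parse_count(field):
    """Convert a comma-grouped integer such as '1,362,730' to an int."""
    return int(field.replace(",", ""))


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by timesteps, rounded to 4 places."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# Most recent trickle (22 Jan 2013): the table reports 1.7525.
latest = average_sec_per_timestep(parse_count("1,362,730"), parse_count("777,600"))

# Oldest listed trickle (28 Nov 2012): the table reports 1.7380.
oldest = average_sec_per_timestep(parse_count("495,528"), parse_count("285,120"))
```

Under this assumption both computed values agree with the reported column to four decimal places.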