climateprediction.net home page
Task 16162290

Task 16162290

Name hadcm3n_oft3_1900_40_008475610_1
Workunit 8626449
Created 27 Dec 2013, 20:31:01 UTC
Sent 27 Dec 2013, 20:31:11 UTC
Report deadline 29 Mar 2014, 3:58:22 UTC
Received 21 Jan 2014, 2:28:41 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1185835
Run time 8 days 2 hours 1 min 23 sec
CPU time 7 days 2 hours 23 min 32 sec
Validate state Invalid
Credit 5,909.76
Device peak FLOPS 2.26 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
14:32:11 (1092): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:29:06 (4584): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:29:07 (4584): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7612, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6136, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5936, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3812, iMonCtr=1
Model crash detected, will try to restart...
06:01:54 (4416): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:54:43 (4368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:54:44 (4368): No heartbeat from core client for 30 sec - exiting
11:55:35 (3056): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:59:45 (7892): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:59:46 (7892): No heartbeat from core client for 30 sec - exiting
11:59:48 (7892): No heartbeat from core client for 30 sec - exiting
11:59:49 (7892): No heartbeat from core client for 30 sec - exiting
11:59:50 (7892): No heartbeat from core client for 30 sec - exiting
11:59:51 (7892): No heartbeat from core client for 30 sec - exiting
11:59:52 (7892): No heartbeat from core client for 30 sec - exiting
11:59:53 (7892): No heartbeat from core client for 30 sec - exiting
11:59:54 (7892): No heartbeat from core client for 30 sec - exiting
11:59:55 (7892): No heartbeat from core client for 30 sec - exiting
11:59:56 (7892): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
12:01:58 (6888): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
12:22:16 (5096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:29:56 (5104): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:29:57 (5104): No heartbeat from core client for 30 sec - exiting
12:32:48 (4436): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:35:00 (388): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:40:01 (7068): No heartbeat from core client for 30 sec - exiting
12:40:02 (7068): No heartbeat from core client for 30 sec - exiting
12:40:03 (7068): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=944, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:08:39 (4324): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
20:34:10 (8556): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:34:12 (8556): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
20:45:19 (6256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:45:20 (6256): No heartbeat from core client for 30 sec - exiting
20:46:03 (7864): No heartbeat from core client for 30 sec - exiting
20:46:04 (7864): No heartbeat from core client for 30 sec - exiting
20:46:05 (7864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:46:06 (7864): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2428, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
21:29:01 (6080): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
21:29:03 (6080): No heartbeat from core client for 30 sec - exiting
21:29:04 (6080): No heartbeat from core client for 30 sec - exiting
21:29:05 (6080): No heartbeat from core client for 30 sec - exiting
21:29:06 (6080): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5540, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
20:44:21 (4292): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7352, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:32:43 (6484): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:31:07 (6456): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
21:37:12 (6632): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:37:13 (6632): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3404, iMonCtr=1
Model crash detected, will try to restart...
07:08:46 (6368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:26:01 (7704): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
12:26:02 (7704): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8472, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
19 Jan 2014 21:37:53 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 492,480 603,054 1.2245
18 Jan 2014 22:32:46 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 466,560 571,041 1.2239
16 Jan 2014 17:17:20 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 440,640 539,013 1.2233
15 Jan 2014 03:27:35 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 414,720 507,952 1.2248
12 Jan 2014 19:54:46 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 388,800 475,925 1.2241
11 Jan 2014 22:36:02 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 362,880 444,182 1.2240
11 Jan 2014 02:19:52 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 336,960 412,994 1.2256
09 Jan 2014 17:36:54 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 311,040 380,486 1.2233
07 Jan 2014 20:11:53 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 285,120 349,228 1.2248
06 Jan 2014 05:57:57 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 259,200 317,381 1.2245
05 Jan 2014 05:34:19 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 233,280 285,963 1.2258
04 Jan 2014 04:48:57 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 207,360 253,938 1.2246
03 Jan 2014 07:50:27 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 181,440 221,494 1.2208
02 Jan 2014 02:18:49 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 155,520 190,742 1.2265
01 Jan 2014 00:57:28 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 129,600 159,642 1.2318
31 Dec 2013 02:43:39 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 103,680 127,871 1.2333
30 Dec 2013 06:36:25 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 77,760 95,664 1.2302
29 Dec 2013 07:02:01 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 51,840 63,380 1.2226
28 Dec 2013 19:14:26 1185835 16162290 hadcm3n_oft3_1900_40_008475610_1 25,920 31,525 1.2162


©2024 climateprediction.net