Name | hadcm3n_8dsf_1980_40_008727290_2 |
Workunit | 8873268 |
Created | 12 May 2014, 22:57:53 UTC |
Sent | 12 May 2014, 23:40:55 UTC |
Report deadline | 12 Aug 2014, 7:08:06 UTC |
Received | 4 Jun 2014, 21:01:18 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1299063 |
Run time | 20 days 17 hours 50 min 55 sec |
CPU time | 18 days 20 hours 57 min 5 sec |
Validate state | Invalid |
Credit | 11,508.48 |
Device peak FLOPS | 2.23 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.64</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 08:08:32 (11812): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 20:50:55 (10228): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:39:03 (9544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:55:02 (9128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 10:28:05 (11020): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
16:07:52 (1728): No heartbeat from core client for 30 sec - exiting 16:07:53 (1728): No heartbeat from core client for 30 sec - exiting 16:07:54 (1728): No heartbeat from core client for 30 sec - exiting 16:07:55 (1728): No heartbeat from core client for 30 sec - exiting 16:07:56 (1728): No heartbeat from core client for 30 sec - exiting 16:07:57 (1728): No heartbeat from core client for 30 sec - exiting 16:07:58 (1728): No heartbeat from core client for 30 sec - exiting 16:07:59 (1728): No heartbeat from core client for 30 sec - exiting 16:08:00 (1728): No heartbeat from core client for 30 sec - exiting 16:08:01 (1728): No heartbeat from core client for 30 sec - exiting 16:08:02 (1728): No heartbeat from core client for 30 sec - exiting 16:08:03 (1728): No heartbeat from core client for 30 sec - exiting 16:08:04 (1728): No heartbeat from core client for 30 sec - exiting 16:08:05 (1728): No heartbeat from core client for 30 sec - exiting 16:08:06 (1728): No heartbeat from core client for 30 sec - exiting 16:08:07 (1728): No heartbeat from core client for 30 sec - exiting 16:08:08 (1728): No heartbeat from core client for 30 sec - exiting 16:08:09 (1728): No heartbeat from core client for 30 sec - exiting 16:08:10 (1728): No heartbeat from core client for 30 sec - exiting 16:08:11 (1728): No heartbeat from core client for 30 sec - exiting 16:08:12 (1728): No heartbeat from core client for 30 sec - exiting 16:08:13 (1728): No heartbeat from core client for 30 sec - exiting 16:08:14 (1728): No heartbeat from core client for 30 sec - exiting 16:08:15 (1728): No heartbeat from core client for 30 sec - exiting 16:08:16 (1728): No heartbeat from core client for 30 sec - exiting 16:08:17 (1728): No heartbeat from core client for 30 sec - exiting 16:08:18 (1728): No heartbeat from core client for 30 sec - exiting 16:08:19 (1728): No heartbeat from core client for 30 sec - exiting 16:08:20 (1728): No heartbeat from core client for 30 sec - exiting 16:08:21 (1728): No 
heartbeat from core client for 30 sec - exiting 16:08:22 (1728): No heartbeat from core client for 30 sec - exiting 16:08:23 (1728): No heartbeat from core client for 30 sec - exiting 16:08:24 (1728): No heartbeat from core client for 30 sec - exiting 16:08:25 (1728): No heartbeat from core client for 30 sec - exiting 16:08:26 (1728): No heartbeat from core client for 30 sec - exiting 16:08:27 (1728): No heartbeat from core client for 30 sec - exiting 16:08:28 (1728): No heartbeat from core client for 30 sec - exiting 16:08:29 (1728): No heartbeat from core client for 30 sec - exiting 16:08:30 (1728): No heartbeat from core client for 30 sec - exiting 16:08:31 (1728): No heartbeat from core client for 30 sec - exiting 16:08:32 (1728): No heartbeat from core client for 30 sec - exiting 16:08:33 (1728): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:11:11 (4172): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:11:42 (9304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:15:54 (11776): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
16:15:55 (11776): No heartbeat from core client for 30 sec - exiting 16:15:56 (11776): No heartbeat from core client for 30 sec - exiting 16:15:57 (11776): No heartbeat from core client for 30 sec - exiting 16:15:58 (11776): No heartbeat from core client for 30 sec - exiting 16:15:59 (11776): No heartbeat from core client for 30 sec - exiting 16:16:00 (11776): No heartbeat from core client for 30 sec - exiting 16:16:01 (11776): No heartbeat from core client for 30 sec - exiting 16:16:02 (11776): No heartbeat from core client for 30 sec - exiting 16:16:03 (11776): No heartbeat from core client for 30 sec - exiting 16:16:04 (11776): No heartbeat from core client for 30 sec - exiting 16:17:17 (5844): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
11:49:21 (10700): No heartbeat from core client for 30 sec - exiting 11:49:22 (10700): No heartbeat from core client for 30 sec - exiting 11:49:23 (10700): No heartbeat from core client for 30 sec - exiting 11:49:24 (10700): No heartbeat from core client for 30 sec - exiting 11:49:25 (10700): No heartbeat from core client for 30 sec - exiting 11:49:26 (10700): No heartbeat from core client for 30 sec - exiting 11:49:27 (10700): No heartbeat from core client for 30 sec - exiting 11:49:28 (10700): No heartbeat from core client for 30 sec - exiting 11:49:29 (10700): No heartbeat from core client for 30 sec - exiting 11:49:30 (10700): No heartbeat from core client for 30 sec - exiting 11:49:31 (10700): No heartbeat from core client for 30 sec - exiting 11:49:32 (10700): No heartbeat from core client for 30 sec - exiting 11:49:33 (10700): No heartbeat from core client for 30 sec - exiting 11:49:34 (10700): No heartbeat from core client for 30 sec - exiting 11:49:35 (10700): No heartbeat from core client for 30 sec - exiting 11:49:36 (10700): No heartbeat from core client for 30 sec - exiting 11:49:37 (10700): No heartbeat from core client for 30 sec - exiting 11:49:38 (10700): No heartbeat from core client for 30 sec - exiting 11:49:39 (10700): No heartbeat from core client for 30 sec - exiting 11:49:40 (10700): No heartbeat from core client for 30 sec - exiting 11:49:41 (10700): No heartbeat from core client for 30 sec - exiting 11:49:42 (10700): No heartbeat from core client for 30 sec - exiting 11:49:43 (10700): No heartbeat from core client for 30 sec - exiting 11:49:44 (10700): No heartbeat from core client for 30 sec - exiting 11:49:45 (10700): No heartbeat from core client for 30 sec - exiting 11:49:46 (10700): No heartbeat from core client for 30 sec - exiting 11:49:47 (10700): No heartbeat from core client for 30 sec - exiting 11:49:48 (10700): No heartbeat from core client for 30 sec - exiting 11:49:49 (10700): No heartbeat from core client for 30 sec - 
exiting 11:49:50 (10700): No heartbeat from core client for 30 sec - exiting 11:49:51 (10700): No heartbeat from core client for 30 sec - exiting 11:49:52 (10700): No heartbeat from core client for 30 sec - exiting 11:49:53 (10700): No heartbeat from core client for 30 sec - exiting 11:49:54 (10700): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:52:15 (11080): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 11:55:31 (6124): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 12:55:35 (6968): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:55:36 (6968): No heartbeat from core client for 30 sec - exiting 12:55:37 (6968): No heartbeat from core client for 30 sec - exiting 12:55:38 (6968): No heartbeat from core client for 30 sec - exiting 12:55:39 (6968): No heartbeat from core client for 30 sec - exiting 12:55:40 (6968): No heartbeat from core client for 30 sec - exiting 12:55:41 (6968): No heartbeat from core client for 30 sec - exiting 12:55:42 (6968): No heartbeat from core client for 30 sec - exiting 12:55:43 (6968): No heartbeat from core client for 30 sec - exiting 12:55:44 (6968): No heartbeat from core client for 30 sec - exiting 12:55:45 (6968): No heartbeat from core client for 30 sec - exiting 13:00:03 (12748): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
13:00:04 (12748): No heartbeat from core client for 30 sec - exiting 13:00:05 (12748): No heartbeat from core client for 30 sec - exiting 13:00:06 (12748): No heartbeat from core client for 30 sec - exiting 13:00:07 (12748): No heartbeat from core client for 30 sec - exiting 13:00:08 (12748): No heartbeat from core client for 30 sec - exiting 13:00:09 (12748): No heartbeat from core client for 30 sec - exiting 13:00:10 (12748): No heartbeat from core client for 30 sec - exiting 13:00:11 (12748): No heartbeat from core client for 30 sec - exiting 13:00:12 (12748): No heartbeat from core client for 30 sec - exiting 13:00:13 (12748): No heartbeat from core client for 30 sec - exiting 14:13:34 (9240): No heartbeat from core client for 30 sec - exiting 14:13:36 (9240): No heartbeat from core client for 30 sec - exiting 14:13:37 (9240): No heartbeat from core client for 30 sec - exiting 14:13:38 (9240): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:19:15 (11388): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 15:17:28 (2128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:21:55 (11112): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
15:21:56 (11112): No heartbeat from core client for 30 sec - exiting 15:21:57 (11112): No heartbeat from core client for 30 sec - exiting 15:21:58 (11112): No heartbeat from core client for 30 sec - exiting 15:21:59 (11112): No heartbeat from core client for 30 sec - exiting 15:22:00 (11112): No heartbeat from core client for 30 sec - exiting 15:22:01 (11112): No heartbeat from core client for 30 sec - exiting 15:22:02 (11112): No heartbeat from core client for 30 sec - exiting 15:22:03 (11112): No heartbeat from core client for 30 sec - exiting 15:22:04 (11112): No heartbeat from core client for 30 sec - exiting 15:22:05 (11112): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 08:43:36 (3052): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:40:27 (11600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:33:55 (9752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:39:41 (9468): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:40:32 (9640): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:41:24 (11540): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 08:44:25 (6104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
08:45:29 (8764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 08:47:58 (10708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:51:03 (11640): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:56:00 (12208): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:56:01 (12208): No heartbeat from core client for 30 sec - exiting 08:56:02 (12208): No heartbeat from core client for 30 sec - exiting 08:56:03 (12208): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 09:12:43 (13116): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 10:12:39 (8336): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:55:08 (10884): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10992, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10992, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10992, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10992, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10992, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7212, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
04 Jun 2014 01:54:45 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 959,040 | 1,594,055 | 1.6621 |
03 Jun 2014 10:37:01 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 933,120 | 1,550,330 | 1.6614 |
02 Jun 2014 20:24:09 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 907,200 | 1,507,818 | 1.6621 |
02 Jun 2014 05:11:33 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 881,280 | 1,465,335 | 1.6627 |
01 Jun 2014 16:27:23 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 855,360 | 1,422,750 | 1.6633 |
01 Jun 2014 03:36:50 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 829,440 | 1,380,091 | 1.6639 |
31 May 2014 14:11:26 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 803,520 | 1,337,240 | 1.6642 |
31 May 2014 01:04:48 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 777,600 | 1,294,552 | 1.6648 |
30 May 2014 10:22:23 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 751,680 | 1,251,947 | 1.6655 |
29 May 2014 17:28:23 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 725,760 | 1,209,273 | 1.6662 |
28 May 2014 22:00:34 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 699,840 | 1,166,239 | 1.6664 |
28 May 2014 02:50:18 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 673,920 | 1,122,346 | 1.6654 |
27 May 2014 11:31:38 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 648,000 | 1,077,230 | 1.6624 |
26 May 2014 21:45:21 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 622,080 | 1,034,157 | 1.6624 |
26 May 2014 07:52:51 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 596,160 | 990,939 | 1.6622 |
25 May 2014 18:54:58 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 570,240 | 947,672 | 1.6619 |
25 May 2014 04:05:39 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 544,320 | 904,355 | 1.6614 |
24 May 2014 14:06:42 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 518,400 | 861,039 | 1.6610 |
24 May 2014 01:14:56 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 492,480 | 817,857 | 1.6607 |
23 May 2014 11:10:56 | 1299063 | 16636861 | hadcm3n_8dsf_1980_40_008727290_2 | 466,560 | 774,664 | 1.6604 |
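
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table. The function name `seconds_per_timestep` and the `rows` list are illustrative only and are not part of any BOINC or CPDN tooling.

```python
# Minimal sketch: reproduce the "Average (sec/TS)" column from the other two
# numeric columns. Assumption: average = CPU time / timestep, rounded to 4 dp.

def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Return cumulative CPU seconds per model timestep, rounded to 4 places."""
    return round(cpu_seconds / timesteps, 4)

# (timestep, CPU time in seconds) copied from the first, second, and last
# rows of the trickle table.
rows = [
    (959_040, 1_594_055),
    (933_120, 1_550_330),
    (466_560, 774_664),
]

averages = [seconds_per_timestep(cpu, ts) for ts, cpu in rows]

# Timestep gap between the two most recent trickles.
trickle_interval = rows[0][0] - rows[1][0]

print(averages)
print(trickle_interval)
```

Under this assumption the computed values agree with the listed averages for those rows, and consecutive trickles in the table are 25,920 timesteps apart.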
©2024 climateprediction.net