Name | hadcm3n_yfg2_1940_40_007414640_0 |
Workunit | 7612270 |
Created | 17 Aug 2011, 14:22:12 UTC |
Sent | 17 Aug 2011, 16:26:53 UTC |
Report deadline | 16 Nov 2011, 23:54:04 UTC |
Received | 26 Sep 2011, 21:25:06 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1051474 |
Run time | 21 days 9 hours 10 min |
CPU time | 19 days 0 hours 33 min 49 sec |
Validate state | Invalid |
Credit | 10,264.32 |
Device peak FLOPS | 2.69 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.34</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:41:15 (33712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:40:38 (17684): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:45:15 (6676): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:06:51 (7696): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:40:17 (3808): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:07:41 (3556): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4940, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4940, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4940, iMonCtr=1 Model crash detected, will try to restart... 11:47:31 (4940): No heartbeat from core client for 30 sec - exiting 11:47:32 (4940): No heartbeat from core client for 30 sec - exiting 11:47:34 (4940): No heartbeat from core client for 30 sec - exiting 11:47:35 (4940): No heartbeat from core client for 30 sec - exiting 11:47:36 (4940): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:47:37 (4940): No heartbeat from core client for 30 sec - exiting 11:58:17 (4152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=500, iMonCtr=1 Model crash detected, will try to restart... 
13:18:20 (500): No heartbeat from core client for 30 sec - exiting 13:18:21 (500): No heartbeat from core client for 30 sec - exiting 13:18:22 (500): No heartbeat from core client for 30 sec - exiting 13:18:23 (500): No heartbeat from core client for 30 sec - exiting 13:18:24 (500): No heartbeat from core client for 30 sec - exiting 13:18:25 (500): No heartbeat from core client for 30 sec - exiting 13:18:26 (500): No heartbeat from core client for 30 sec - exiting 13:18:27 (500): No heartbeat from core client for 30 sec - exiting 13:18:29 (500): No heartbeat from core client for 30 sec - exiting 13:18:30 (500): No heartbeat from core client for 30 sec - exiting 13:18:31 (500): No heartbeat from core client for 30 sec - exiting 13:18:32 (500): No heartbeat from core client for 30 sec - exiting 13:18:33 (500): No heartbeat from core client for 30 sec - exiting 13:18:34 (500): No heartbeat from core client for 30 sec - exiting 13:18:35 (500): No heartbeat from core client for 30 sec - exiting 13:18:36 (500): No heartbeat from core client for 30 sec - exiting 13:18:37 (500): No heartbeat from core client for 30 sec - exiting 13:18:38 (500): No heartbeat from core client for 30 sec - exiting 13:18:39 (500): No heartbeat from core client for 30 sec - exiting 13:18:41 (500): No heartbeat from core client for 30 sec - exiting 13:18:42 (500): No heartbeat from core client for 30 sec - exiting 13:18:43 (500): No heartbeat from core client for 30 sec - exiting 13:18:44 (500): No heartbeat from core client for 30 sec - exiting 13:18:45 (500): No heartbeat from core client for 30 sec - exiting 13:18:46 (500): No heartbeat from core client for 30 sec - exiting 13:18:47 (500): No heartbeat from core client for 30 sec - exiting 13:18:48 (500): No heartbeat from core client for 30 sec - exiting 13:18:49 (500): No heartbeat from core client for 30 sec - exiting 13:18:50 (500): No heartbeat from core client for 30 sec - exiting 13:18:51 (500): No heartbeat from core client for 30 sec 
- exiting 13:18:53 (500): No heartbeat from core client for 30 sec - exiting 13:18:54 (500): No heartbeat from core client for 30 sec - exiting 13:18:55 (500): No heartbeat from core client for 30 sec - exiting 13:18:56 (500): No heartbeat from core client for 30 sec - exiting 13:18:57 (500): No heartbeat from core client for 30 sec - exiting 13:18:58 (500): No heartbeat from core client for 30 sec - exiting 13:18:59 (500): No heartbeat from core client for 30 sec - exiting 13:19:00 (500): No heartbeat from core client for 30 sec - exiting 13:19:01 (500): No heartbeat from core client for 30 sec - exiting 13:19:02 (500): No heartbeat from core client for 30 sec - exiting 13:19:03 (500): No heartbeat from core client for 30 sec - exiting 13:19:05 (500): No heartbeat from core client for 30 sec - exiting 13:19:06 (500): No heartbeat from core client for 30 sec - exiting 13:19:07 (500): No heartbeat from core client for 30 sec - exiting 13:19:08 (500): No heartbeat from core client for 30 sec - exiting 13:19:09 (500): No heartbeat from core client for 30 sec - exiting 13:19:10 (500): No heartbeat from core client for 30 sec - exiting 13:19:11 (500): No heartbeat from core client for 30 sec - exiting 13:19:12 (500): No heartbeat from core client for 30 sec - exiting 13:19:13 (500): No heartbeat from core client for 30 sec - exiting 13:19:14 (500): No heartbeat from core client for 30 sec - exiting 13:19:15 (500): No heartbeat from core client for 30 sec - exiting 13:19:17 (500): No heartbeat from core client for 30 sec - exiting 13:19:18 (500): No heartbeat from core client for 30 sec - exiting 13:19:19 (500): No heartbeat from core client for 30 sec - exiting 13:19:20 (500): No heartbeat from core client for 30 sec - exiting 13:19:21 (500): No heartbeat from core client for 30 sec - exiting 13:19:22 (500): No heartbeat from core client for 30 sec - exiting 13:19:23 (500): No heartbeat from core client for 30 sec - exiting 13:19:24 (500): No heartbeat from core client 
for 30 sec - exiting 13:19:25 (500): No heartbeat from core client for 30 sec - exiting 13:19:26 (500): No heartbeat from core client for 30 sec - exiting 13:19:27 (500): No heartbeat from core client for 30 sec - exiting 13:19:29 (500): No heartbeat from core client for 30 sec - exiting 13:19:30 (500): No heartbeat from core client for 30 sec - exiting 13:19:31 (500): No heartbeat from core client for 30 sec - exiting 13:19:32 (500): No heartbeat from core client for 30 sec - exiting 13:19:33 (500): No heartbeat from core client for 30 sec - exiting 13:19:34 (500): No heartbeat from core client for 30 sec - exiting 13:19:35 (500): No heartbeat from core client for 30 sec - exiting 13:19:36 (500): No heartbeat from core client for 30 sec - exiting 13:19:37 (500): No heartbeat from core client for 30 sec - exiting 13:19:38 (500): No heartbeat from core client for 30 sec - exiting 13:19:39 (500): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:27:56 (1400): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4980, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4980, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
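The summary fields above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the BOINC or CPDN tooling: the variable names are illustrative assumptions, and the values are copied by hand from the Run time, CPU time, and Exit status rows.

```python
from datetime import timedelta

# Values transcribed from the summary rows above (assumed to be exact).
run_time = timedelta(days=21, hours=9, minutes=10)              # "Run time"
cpu_time = timedelta(days=19, hours=0, minutes=33, seconds=49)  # "CPU time"

run_seconds = int(run_time.total_seconds())
cpu_seconds = int(cpu_time.total_seconds())

# Fraction of elapsed run time during which the task actually used the CPU.
cpu_fraction = cpu_seconds / run_seconds

# The exit status is shown both in decimal (22) and in hex (0x00000016).
exit_status = 22
exit_status_hex = f"0x{exit_status:08X}"

print(run_seconds, cpu_seconds, round(cpu_fraction, 4), exit_status_hex)
```

The resulting CPU time in seconds is slightly higher than the cumulative CPU time in the most recent trickle, which is what one would expect if the task kept running for a short while after its last trickle was sent.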
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
25 Sep 2011 14:39:14 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 855,360 | 1,630,075 | 1.9057 |
24 Sep 2011 16:32:17 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 829,440 | 1,575,589 | 1.8996 |
24 Sep 2011 01:11:27 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 803,520 | 1,525,017 | 1.8979 |
22 Sep 2011 18:04:02 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 777,600 | 1,474,564 | 1.8963 |
22 Sep 2011 02:38:44 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 751,680 | 1,423,912 | 1.8943 |
21 Sep 2011 11:53:34 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 725,760 | 1,375,919 | 1.8958 |
20 Sep 2011 20:22:28 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 699,840 | 1,327,027 | 1.8962 |
14 Sep 2011 09:03:10 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 673,920 | 1,281,465 | 1.9015 |
13 Sep 2011 19:10:11 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 648,000 | 1,238,369 | 1.9111 |
07 Sep 2011 14:54:44 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 622,080 | 1,195,780 | 1.9222 |
06 Sep 2011 15:41:10 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 596,160 | 1,148,415 | 1.9264 |
05 Sep 2011 22:59:59 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 570,240 | 1,099,811 | 1.9287 |
05 Sep 2011 08:19:17 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 544,320 | 1,048,870 | 1.9269 |
04 Sep 2011 09:00:22 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 518,400 | 997,205 | 1.9236 |
03 Sep 2011 16:59:27 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 492,480 | 944,624 | 1.9181 |
02 Sep 2011 22:09:25 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 466,560 | 892,194 | 1.9123 |
01 Sep 2011 07:28:42 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 440,640 | 839,642 | 1.9055 |
31 Aug 2011 15:40:27 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 414,720 | 788,997 | 1.9025 |
30 Aug 2011 23:43:45 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 388,800 | 739,363 | 1.9017 |
30 Aug 2011 08:31:26 | 1051474 | 13272207 | hadcm3n_yfg2_1940_40_007414640_0 | 362,880 | 690,550 | 1.9030 |
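The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that assumption against the three most recent trickles and also derives a per-interval rate between consecutive trickles. The function and variable names are hypothetical and the figures are copied by hand from the table.

```python
# (timestep, cumulative CPU seconds) for the three most recent trickles,
# newest first, transcribed from the table above.
trickles = [
    (855_360, 1_630_075),
    (829_440, 1_575_589),
    (803_520, 1_525_017),
]

def average_sec_per_ts(timestep, cpu_seconds):
    """Cumulative CPU seconds per model timestep, rounded like the table."""
    return round(cpu_seconds / timestep, 4)

averages = [average_sec_per_ts(ts, cpu) for ts, cpu in trickles]

# Rate over each interval between consecutive trickles (newer minus older).
interval_rates = [
    (cpu_new - cpu_old) / (ts_new - ts_old)
    for (ts_new, cpu_new), (ts_old, cpu_old) in zip(trickles, trickles[1:])
]

print(averages)
print([round(rate, 4) for rate in interval_rates])
```

The per-interval rate is more sensitive to short-term slowdowns than the cumulative average, so it can be a useful second view when comparing trickles sent before and after a period of suspensions.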
©2024 climateprediction.net