Name | hadcm3n_8ajw_1980_40_008723095_0 |
Workunit | 8869073 |
Created | 23 Apr 2014, 12:56:09 UTC |
Sent | 4 May 2014, 0:56:45 UTC |
Report deadline | 3 Aug 2014, 8:23:56 UTC |
Received | 21 Feb 2015, 8:50:18 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1244751 |
Run time | 24 days 18 hours 4 min 17 sec |
CPU time | 23 days 20 hours 39 min 59 sec |
Validate state | Invalid |
Credit | 10,886.40 |
Device peak FLOPS | 2.23 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=80076, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
15:18:50 (10036): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
13:02:22 (4868): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:51:17 (3280): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
02:50:17 (135720): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2700, iMonCtr=1
Model crash detected, will try to restart...
08:18:19 (5412): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:16:11 (5208): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:12:38 (3520): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:11:35 (976): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:09:30 (3948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4768, iMonCtr=1
Model crash detected, will try to restart...
22:15:33 (6100): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:14:33 (19508): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:49:52 (3864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:48:50 (21620): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:47:44 (34864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:44:30 (52180): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:43:30 (98492): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5400, iMonCtr=1
Model crash detected, will try to restart...
12:16:06 (5068): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:15:06 (20384): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:13:47 (33816): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3776, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3776, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3776, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3776, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3776, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3776, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]> |
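The stderr text above repeats a small number of message types, so a tally can make the failure pattern easier to read. The following is a minimal sketch, not part of the BOINC or CPDN tooling: the function name `tally_events`, the `PATTERNS` labels, and the shortened `sample` excerpt are all assumptions made for illustration, and the matching is plain substring counting.

```python
from collections import Counter

# Message fragments copied from the stderr text; the labels are hypothetical.
PATTERNS = {
    "heartbeat_lost": "No heartbeat from core client",
    "model_crash": "Model crash detected",
    "signal_22": "Signal 22 received",
}


def tally_events(stderr_text: str) -> Counter:
    """Count how often each known message fragment occurs in the text."""
    return Counter(
        {label: stderr_text.count(fragment) for label, fragment in PATTERNS.items()}
    )


# Shortened excerpt assembled from lines in the log above (not the full log).
sample = (
    "15:18:50 (10036): No heartbeat from core client for 30 sec - exiting\n"
    "Model crash detected, will try to restart...\n"
    "Signal 22 received, exiting...\n"
    "Model crash detected, will try to restart...\n"
)

print(tally_events(sample))
```

Passing the full stderr text instead of `sample` would give counts for the whole task.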
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
15 Feb 2015 17:15:18 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 907,200 | 2,047,471 | 2.2569 |
15 Feb 2015 00:07:02 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 881,280 | 1,986,475 | 2.2541 |
14 Feb 2015 07:47:49 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 855,360 | 1,928,689 | 2.2548 |
13 Feb 2015 13:15:36 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 829,440 | 1,864,771 | 2.2482 |
12 Feb 2015 18:45:00 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 803,520 | 1,801,248 | 2.2417 |
12 Feb 2015 00:16:51 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 777,600 | 1,738,213 | 2.2354 |
11 Feb 2015 07:35:49 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 751,680 | 1,679,844 | 2.2348 |
10 Feb 2015 15:33:42 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 725,760 | 1,623,079 | 2.2364 |
09 Feb 2015 23:22:08 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 699,840 | 1,565,659 | 2.2372 |
09 Feb 2015 06:57:39 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 673,920 | 1,507,671 | 2.2372 |
08 Feb 2015 14:45:24 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 648,000 | 1,450,161 | 2.2379 |
17 Dec 2014 08:41:54 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 622,080 | 1,389,773 | 2.2341 |
16 Dec 2014 11:05:47 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 596,160 | 1,330,259 | 2.2314 |
15 Dec 2014 17:40:57 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 570,240 | 1,269,173 | 2.2257 |
15 Dec 2014 00:57:14 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 544,320 | 1,209,328 | 2.2217 |
14 Dec 2014 08:04:39 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 518,400 | 1,149,212 | 2.2168 |
13 Dec 2014 14:33:58 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 492,480 | 1,088,387 | 2.2100 |
12 Dec 2014 21:37:02 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 466,560 | 1,027,732 | 2.2028 |
19 Oct 2014 09:00:34 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 440,640 | 969,620 | 2.2005 |
18 Oct 2014 16:38:24 | 1244751 | 16587913 | hadcm3n_8ajw_1980_40_008723095_0 | 414,720 | 912,664 | 2.2007 |
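
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. That relationship is inferred from the figures, not stated on the page. A minimal sketch that checks it against three rows copied from the table (the function name `average_sec_per_timestep` is hypothetical):

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimal places."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
rows = [
    (907_200, 2_047_471, 2.2569),
    (881_280, 1_986_475, 2.2541),
    (414_720, 912_664, 2.2007),
]

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    # Print the computed value next to the value reported on the page.
    print(timestep, computed, reported)
```

For these three rows the computed value agrees with the reported one to four decimal places.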
©2024 climateprediction.net