Name | hadcm3n_zmyj_1880_40_008026549_2 |
Workunit | 8181663 |
Created | 18 Jul 2012, 19:36:21 UTC |
Sent | 18 Jul 2012, 19:36:39 UTC |
Report deadline | 18 Oct 2012, 3:03:50 UTC |
Received | 6 Oct 2012, 2:13:39 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1107747 |
Run time | 25 days 21 hours 27 min 38 sec |
CPU time | 25 days 2 hours 46 min 19 sec |
Validate state | Invalid |
Credit | 11,508.48 |
Device peak FLOPS | 2.80 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.34</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 03:47:30 (2656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 11:46:26 (5892): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:45:22 (5024): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 03:44:19 (4352): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:43:14 (4320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Atmos Hold Restart file rename failed on atmos_restart.hold Atmos Restart file copy failed on zmyjka.da903n0 Atmos Hold Restart file rename failed on atmos_restart.hold Atmos Hold Restart file rename failed on atmos_restart.hold Atmos Hold Restart file rename failed on atmos_restart.hold Atmos Hold Restart file rename failed on atmos_restart.hold Atmos Hold Restart file rename failed on atmos_restart.hold Atmos Hold Restart file rename failed on atmos_restart.hold 04:59:36 (4868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 05:05:28 (3188): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 05:32:44 (2908): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 
05:43:49 (1784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 21:28:32 (1360): No heartbeat from core client for 30 sec - exiting 21:28:33 (1360): No heartbeat from core client for 30 sec - exiting 21:28:34 (1360): No heartbeat from core client for 30 sec - exiting 21:28:35 (1360): No heartbeat from core client for 30 sec - exiting 21:28:37 (1360): No heartbeat from core client for 30 sec - exiting 21:28:38 (1360): No heartbeat from core client for 30 sec - exiting 21:28:39 (1360): No heartbeat from core client for 30 sec - exiting 21:28:40 (1360): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 02:44:36 (324): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:43:28 (200): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:42:19 (2744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
11:41:11 (5100): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:40:02 (2988): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:38:57 (5168): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:37:49 (3544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:36:44 (5012): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:35:37 (3564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 05:34:30 (2320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:33:23 (2328): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:32:16 (4464): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:31:07 (3020): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:30:00 (1192): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:28:52 (588): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:27:49 (3164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 02:26:41 (5876): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 05:25:36 (1268): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2976, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2976, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2976, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2976, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2976, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2976, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
02 Oct 2012 20:12:39 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 959,040 | 2,145,986 | 2.2376 |
18 Sep 2012 06:55:05 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 933,120 | 2,086,585 | 2.2361 |
17 Sep 2012 14:36:33 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 907,200 | 2,029,334 | 2.2369 |
16 Sep 2012 22:02:26 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 881,280 | 1,971,370 | 2.2369 |
16 Sep 2012 05:28:44 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 855,360 | 1,912,986 | 2.2365 |
15 Sep 2012 12:53:26 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 829,440 | 1,854,633 | 2.2360 |
14 Sep 2012 20:28:45 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 803,520 | 1,796,734 | 2.2361 |
14 Sep 2012 04:01:20 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 777,600 | 1,738,775 | 2.2361 |
27 Aug 2012 23:48:46 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 751,680 | 1,680,609 | 2.2358 |
23 Aug 2012 10:36:47 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 725,760 | 1,621,851 | 2.2347 |
22 Aug 2012 18:08:29 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 699,840 | 1,563,318 | 2.2338 |
22 Aug 2012 01:09:32 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 673,920 | 1,504,484 | 2.2324 |
21 Aug 2012 08:39:52 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 648,000 | 1,446,710 | 2.2326 |
20 Aug 2012 16:15:26 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 622,080 | 1,388,716 | 2.2324 |
17 Aug 2012 07:24:36 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 596,160 | 1,331,706 | 2.2338 |
10 Aug 2012 22:24:46 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 570,240 | 1,272,238 | 2.2311 |
10 Aug 2012 05:21:13 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 544,320 | 1,212,573 | 2.2277 |
09 Aug 2012 13:02:08 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 518,400 | 1,155,268 | 2.2285 |
08 Aug 2012 20:49:24 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 492,480 | 1,097,838 | 2.2292 |
08 Aug 2012 04:20:55 | 1107747 | 14939168 | hadcm3n_zmyj_1880_40_008026549_2 | 466,560 | 1,039,703 | 2.2284 |
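
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The page itself does not state this; it is inferred from the figures in the table. The following is a minimal sketch that checks the relationship against three rows copied from the table. The function name `average_sec_per_timestep` is hypothetical and not part of BOINC or CPDN.

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the trickle table derives its last column this way.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, average reported in the table)
rows = [
    (959_040, 2_145_986, 2.2376),
    (933_120, 2_086_585, 2.2361),
    (466_560, 1_039_703, 2.2284),
]

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>9,} TS  computed={computed:.4f}  reported={reported:.4f}")
```

In the rows shown, consecutive trickles are 25,920 timesteps apart (for example, 959,040 minus 933,120).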