Task 13114838

Name	hadcm3n_yhgi_1900_40_007355484_0
Workunit	7552914
Created	6 Jul 2011, 14:40:25 UTC
Sent	10 Jul 2011, 8:58:22 UTC
Report deadline	9 Oct 2011, 16:25:33 UTC
Received	1 Aug 2011, 15:38:44 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1009384
Run time	16 days 3 hours 6 min 56 sec
CPU time	16 days 3 hours 6 min 56 sec
Validate state	Invalid
Credit	9,331.20
Device peak FLOPS	2.45 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
14:03:58 (6036): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
forrtl: The requested operation cannot be performed on a file with a user-mapped section open.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6932, iMonCtr=1
Model crash detected, will try to restart...
17:03:06 (6932): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:03:07 (6932): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
31 Jul 2011 23:23:03	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	777,600	1,354,193	1.7415
31 Jul 2011 04:40:37	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	751,680	1,311,463	1.7447
30 Jul 2011 15:40:20	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	725,760	1,268,953	1.7484
30 Jul 2011 02:39:28	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	699,840	1,226,080	1.7519
29 Jul 2011 21:09:41	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	673,920	1,181,155	1.7527
29 Jul 2011 21:09:41	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	648,000	1,135,357	1.7521
29 Jul 2011 21:09:41	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	622,080	1,092,499	1.7562
27 Jul 2011 21:20:31	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	596,160	1,049,308	1.7601
27 Jul 2011 07:55:35	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	570,240	1,007,110	1.7661
26 Jul 2011 19:16:17	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	544,320	965,965	1.7746
26 Jul 2011 15:29:33	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	518,400	909,244	1.7539
26 Jul 2011 15:29:33	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	492,480	855,006	1.7361
26 Jul 2011 15:29:33	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	466,560	800,196	1.7151
26 Jul 2011 15:29:33	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	440,640	744,456	1.6895
26 Jul 2011 15:29:33	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	414,720	693,590	1.6724
26 Jul 2011 15:29:33	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	388,800	648,547	1.6681
26 Jul 2011 15:29:33	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	362,880	606,604	1.6716
26 Jul 2011 15:29:33	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	336,960	563,806	1.6732
26 Jul 2011 15:29:33	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	311,040	522,663	1.6804
25 Jul 2011 18:09:13	1009384	13114838	hadcm3n_yhgi_1900_40_007355484_0	285,120	480,263	1.6844
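
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep reached, rounded to four decimal places. This is an assumption inferred from the rows shown, not something stated on the page. A minimal Python sketch that recomputes the column for three of the rows (the function name `average_sec_per_timestep` is hypothetical):

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the page derives "Average (sec/TS)" this way; the formula
    is inferred from the table values, not documented in the source.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, average reported in the table)
rows = [
    (777_600, 1_354_193, 1.7415),
    (751_680, 1_311_463, 1.7447),
    (285_120, 480_263, 1.6844),
]

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"timestep {timestep:>7,}: computed {computed:.4f}, reported {reported:.4f}")
```

Consecutive rows differ by 25,920 timesteps, so each trickle marks one fixed-size block of model progress. Several rows share an identical Time Sent value, which suggests that trickles were queued on the host and uploaded together.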