Task 13591995

Name	hadcm3n_ydt1_1900_40_007526298_2
Workunit	7723773
Created	4 Nov 2011, 3:54:57 UTC
Sent	4 Nov 2011, 4:03:41 UTC
Report deadline	3 Feb 2012, 11:30:52 UTC
Received	28 Jan 2012, 1:30:45 UTC
Server state	Over
Outcome	Computation error
Client state	Aborted by user
Exit status	-197 (0xFFFFFF3B) ERR_ABORTED_VIA_GUI
Computer ID	1102379
Run time	11 days 1 hours 30 min 58 sec
CPU time	10 days 13 hours 50 min 23 sec
Validate state	Invalid
Credit	5,598.72
Device peak FLOPS	2.90 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.60</core_client_version> <![CDATA[ <message> aborted by user </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3564, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3536, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3668, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3568, iMonCtr=1 Model crash detected, will try to restart... 00:49:30 (2412): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3816, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3296, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3428, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3540, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3412, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3496, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3960, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3884, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4824, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4824, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/ydt1ko.pjb4c10 Error converting file to netcdf: dataout/ydt1ko.pib4c10 Error converting file to netcdf: dataout/ydt1ko.pfb4c10 Error converting file to netcdf: dataout/ydt1ka.phb4c10 Error converting file to netcdf: dataout/ydt1ka.pgb4c10 Error converting file to netcdf: dataout/ydt1ka.peb4c10 Error converting file to netcdf: dataout/ydt1ka.pdb4c10 BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/ydt1ko.pjb4c10 Error converting file to netcdf: dataout/ydt1ko.pib4c10 Error converting file to netcdf: dataout/ydt1ko.pfb4c10 Error converting file to 
netcdf: dataout/ydt1ka.phb4c10 Error converting file to netcdf: dataout/ydt1ka.pgb4c10 Error converting file to netcdf: dataout/ydt1ka.peb4c10 Error converting file to netcdf: dataout/ydt1ka.pdb4c10 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3340, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3644, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3560, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4124, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3476, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3356, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4168, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3576, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3760, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3604, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Atmos Hold Restart file rename failed on atmos_restart.hold Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3764, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5492, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3916, iMonCtr=1 Model crash detected, will try to restart... 16:03:47 (1400): No heartbeat from core client for 30 sec - exiting 16:03:48 (1400): No heartbeat from core client for 30 sec - exiting 16:03:49 (1400): No heartbeat from core client for 30 sec - exiting 16:03:50 (1400): No heartbeat from core client for 30 sec - exiting 16:03:51 (1400): No heartbeat from core client for 30 sec - exiting 16:03:52 (1400): No heartbeat from core client for 30 sec - exiting 16:03:53 (1400): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:03:54 (1400): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3128, iMonCtr=1 Model crash detected, will try to restart... 19:11:10 (3444): No heartbeat from core client for 30 sec - exiting 19:11:11 (3444): No heartbeat from core client for 30 sec - exiting 19:11:12 (3444): No heartbeat from core client for 30 sec - exiting 19:11:13 (3444): No heartbeat from core client for 30 sec - exiting 19:11:14 (3444): No heartbeat from core client for 30 sec - exiting 19:11:15 (3444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:33:42 (4316): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4928, iMonCtr=1 Model crash detected, will try to restart... 23:21:39 (4720): No heartbeat from core client for 30 sec - exiting 23:21:40 (4720): No heartbeat from core client for 30 sec - exiting 23:21:41 (4720): No heartbeat from core client for 30 sec - exiting 23:21:42 (4720): No heartbeat from core client for 30 sec - exiting 23:21:43 (4720): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:47:04 (3684): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:47:06 (3684): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4568, iMonCtr=1 Model crash detected, will try to restart... 21:01:38 (4468): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:14:02 (3148): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:35:28 (4748): No heartbeat from core client for 30 sec - exiting 08:35:29 (4748): No heartbeat from core client for 30 sec - exiting 08:35:30 (4748): No heartbeat from core client for 30 sec - exiting 08:35:31 (4748): No heartbeat from core client for 30 sec - exiting 08:35:32 (4748): No heartbeat from core client for 30 sec - exiting 08:35:33 (4748): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4300, iMonCtr=1 Model crash detected, will try to restart... 
12:21:42 (3104): No heartbeat from core client for 30 sec - exiting 12:21:43 (3104): No heartbeat from core client for 30 sec - exiting 12:21:44 (3104): No heartbeat from core client for 30 sec - exiting 12:21:45 (3104): No heartbeat from core client for 30 sec - exiting 12:21:46 (3104): No heartbeat from core client for 30 sec - exiting 12:21:47 (3104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:21:48 (3104): No heartbeat from core client for 30 sec - exiting 18:33:06 (4060): No heartbeat from core client for 30 sec - exiting 18:33:07 (4060): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4488, iMonCtr=1 Model crash detected, will try to restart... 11:12:58 (4832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:14:34 (4584): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:47:24 (3608): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:47:26 (4048): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:47:27 (4048): No heartbeat from core client for 30 sec - exiting 23:03:16 (3676): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 23:07:50 (3520): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... C19:43:22 (3084): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
20:52:08 (4256): No heartbeat from core client for 30 sec - exiting 20:52:09 (4256): No heartbeat from core client for 30 sec - exiting 20:52:10 (4256): No heartbeat from core client for 30 sec - exiting 20:52:12 (4256): No heartbeat from core client for 30 sec - exiting 20:52:13 (4256): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:52:14 (4256): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 23:29:50 (3920): No heartbeat from core client for 30 sec - exiting 23:29:51 (3920): No heartbeat from core client for 30 sec - exiting 23:29:52 (3920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:29:53 (3920): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2576, iMonCtr=1 Model crash detected, will try to restart... 08:17:27 (3224): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:24:29 (5020): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:29:23 (4244): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:29:24 (4244): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=1 Model crash detected, will try to restart... 08:21:00 (4840): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5032, iMonCtr=1 Model crash detected, will try to restart... 17:11:32 (3872): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
17:11:33 (3872): No heartbeat from core client for 30 sec - exiting 23:03:44 (4208): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:03:45 (4208): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=1 Model crash detected, will try to restart... 08:33:45 (4988): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:33:46 (4988): No heartbeat from core client for 30 sec - exiting 08:33:47 (4988): No heartbeat from core client for 30 sec - exiting 08:33:48 (4988): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5328, iMonCtr=1 Model crash detected, will try to restart... 14:44:42 (4072): No heartbeat from core client for 30 sec - exiting 14:44:43 (4072): No heartbeat from core client for 30 sec - exiting 14:44:44 (4072): No heartbeat from core client for 30 sec - exiting 14:44:45 (4072): No heartbeat from core client for 30 sec - exiting 14:44:46 (4072): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:44:47 (4072): No heartbeat from core client for 30 sec - exiting 22:20:05 (4292): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:20:06 (4292): No heartbeat from core client for 30 sec - exiting 09:33:02 (3564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:29:25 (3896): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4600, iMonCtr=1 Model crash detected, will try to restart... 23:00:27 (3852): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
09:05:40 (3796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Abort request from BOINC... Called boinc_finish </stderr_txt> ]]>
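
The Exit status field above pairs the signed code -197 with the hex value 0xFFFFFF3B. A minimal sketch, assuming the hex form is simply the same code read as an unsigned 32-bit integer (the helper name is hypothetical, not part of BOINC):

```python
def as_unsigned_32bit_hex(code: int) -> str:
    """Render a signed exit code as an 8-digit unsigned 32-bit hex string."""
    # Masking with 0xFFFFFFFF maps a negative value to its two's-complement form.
    return f"0x{code & 0xFFFFFFFF:08X}"


if __name__ == "__main__":
    # Exit status reported for this task.
    print(as_unsigned_32bit_hex(-197))
```

Under that assumption, the two values in the Exit status field are the same number in two notations.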

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
20 Jan 2012 00:27:54	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	466,560	888,301	1.9039
14 Jan 2012 15:13:31	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	440,640	840,940	1.9085
07 Jan 2012 03:41:13	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	414,720	793,207	1.9126
30 Dec 2011 15:12:56	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	388,800	743,745	1.9129
24 Dec 2011 03:33:59	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	362,880	694,971	1.9152
07 Dec 2011 04:19:27	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	336,960	645,363	1.9153
05 Dec 2011 09:06:00	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	311,040	595,886	1.9158
02 Dec 2011 02:30:15	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	285,120	545,352	1.9127
28 Nov 2011 00:53:13	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	259,200	495,627	1.9121
25 Nov 2011 08:05:48	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	233,280	446,351	1.9134
23 Nov 2011 07:02:16	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	207,360	395,105	1.9054
22 Nov 2011 07:08:20	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	181,440	344,279	1.8975
21 Nov 2011 04:32:25	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	155,520	293,773	1.8890
20 Nov 2011 16:04:28	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	129,600	245,075	1.8910
18 Nov 2011 03:52:53	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	103,680	196,288	1.8932
17 Nov 2011 04:52:21	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	77,760	147,898	1.9020
16 Nov 2011 06:08:35	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	51,840	97,041	1.8719
15 Nov 2011 17:30:32	1102379	13591995	hadcm3n_ydt1_1900_40_007526298_2	25,920	48,451	1.8693
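
The Average (sec/TS) column appears to be cumulative CPU time divided by the timestep count, rounded to four decimals. A minimal sketch, assuming that relationship, checks it against three rows copied from the table above (the function and variable names are hypothetical):

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
TRICKLES = [
    (466_560, 888_301, 1.9039),
    (440_640, 840_940, 1.9085),
    (25_920, 48_451, 1.8693),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in TRICKLES:
        computed = round(sec_per_timestep(cpu_time_sec, timestep), 4)
        status = "matches" if abs(computed - reported) < 1e-4 else "differs"
        print(f"timestep {timestep}: computed {computed:.4f}, reported {reported:.4f} ({status})")
```

The same check can be extended to the remaining rows by adding their values to the list.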