Task 16275805

Name	hadcm3n_7c01_1980_40_008426836_3
Workunit	8577692
Created	24 Jan 2014, 20:32:34 UTC
Sent	24 Jan 2014, 20:32:49 UTC
Report deadline	29 Jul 2023, 1:52:49 UTC
Received	17 Jul 2014, 7:41:33 UTC
Server state	Over
Outcome	Success
Client state	Done
Exit status	0 (0x00000000)
Computer ID	1045219
Run time	21 days 17 hours 58 min 26 sec
CPU time	21 days 6 hours 3 min 11 sec
Validate state	Valid
Credit	12,441.60
Device peak FLOPS	2.89 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version> <![CDATA[ <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4392, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3608, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4044, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3968, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=276, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5960, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5584, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4540, iMonCtr=1 Model crash detected, will try to restart... 
05:09:20 (3844): No heartbeat from core client for 30 sec - exiting 05:09:21 (3844): No heartbeat from core client for 30 sec - exiting 05:09:22 (3844): No heartbeat from core client for 30 sec - exiting 05:09:23 (3844): No heartbeat from core client for 30 sec - exiting 05:09:24 (3844): No heartbeat from core client for 30 sec - exiting 05:09:25 (3844): No heartbeat from core client for 30 sec - exiting 05:09:26 (3844): No heartbeat from core client for 30 sec - exiting 05:09:27 (3844): No heartbeat from core client for 30 sec - exiting 05:09:28 (3844): No heartbeat from core client for 30 sec - exiting 05:09:30 (3844): No heartbeat from core client for 30 sec - exiting 05:09:31 (3844): No heartbeat from core client for 30 sec - exiting 05:09:32 (3844): No heartbeat from core client for 30 sec - exiting 05:09:33 (3844): No heartbeat from core client for 30 sec - exiting 05:09:34 (3844): No heartbeat from core client for 30 sec - exiting 05:09:35 (3844): No heartbeat from core client for 30 sec - exiting 05:09:36 (3844): No heartbeat from core client for 30 sec - exiting 05:09:37 (3844): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=272, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3676, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4708, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3216, iMonCtr=1 Model crash detected, will try to restart... BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/7c01ko.pjj3c10 Error converting file to netcdf: dataout/7c01ko.pij3c10 Error converting file to netcdf: dataout/7c01ko.pfj3c10 Error converting file to netcdf: dataout/7c01ka.phj3c10 Error converting file to netcdf: dataout/7c01ka.pgj3c10 Error converting file to netcdf: dataout/7c01ka.pej3c10 Error converting file to netcdf: dataout/7c01ka.pdj3c10 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4748, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4288, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4960, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6024, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4256, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4256, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1948, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1836, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3240, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2104, iMonCtr=1 Model crash detected, will try to restart... 17:38:42 (3860): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3232, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6428, iMonCtr=1 Model crash detected, will try to restart... Called boinc_finish </stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
17 Jul 2014 04:14:34	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	1,036,800	1,835,554	1.7704
13 Jul 2014 08:31:29	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	1,010,880	1,785,075	1.7659
12 Jul 2014 17:35:12	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	984,960	1,734,563	1.7610
06 Jul 2014 09:55:28	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	959,040	1,683,928	1.7558
05 Jul 2014 19:29:05	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	933,120	1,633,247	1.7503
04 Jul 2014 14:22:35	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	907,200	1,585,155	1.7473
29 Jun 2014 10:41:32	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	881,280	1,552,569	1.7617
29 Jun 2014 01:29:34	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	855,360	1,516,305	1.7727
28 Jun 2014 14:36:41	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	829,440	1,480,249	1.7846
24 Jun 2014 16:23:34	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	803,520	1,444,398	1.7976
24 Jun 2014 16:23:34	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	777,600	1,408,485	1.8113
24 Jun 2014 16:23:34	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	751,680	1,374,681	1.8288
21 Jun 2014 16:56:47	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	725,760	1,340,892	1.8476
10 Jun 2014 09:03:37	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	699,840	1,306,540	1.8669
01 Jun 2014 13:54:53	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	673,920	1,266,658	1.8795
28 May 2014 07:51:11	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	648,000	1,216,550	1.8774
25 May 2014 06:01:00	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	622,080	1,165,334	1.8733
24 May 2014 15:31:03	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	596,160	1,113,701	1.8681
11 May 2014 17:06:40	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	570,240	1,062,593	1.8634
04 May 2014 18:05:14	1045219	16275805	hadcm3n_7c01_1980_40_008426836_3	544,320	1,011,591	1.8584
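
The Average (sec/TS) column appears to be the cumulative CPU Time (sec) divided by the Timestep count, rounded to four decimal places. The page does not state this; it is inferred from the figures in the rows above. A minimal sketch that re-derives the column for three of those rows follows, assuming tab-separated fields in the order of the table header. The helper names `parse_int` and `seconds_per_timestep` are hypothetical and used only for illustration.

```python
# Sketch: re-derive "Average (sec/TS)" from the trickle rows.
# Assumption (not stated on the page): average = CPU time / timestep.

def parse_int(text):
    """Convert a number with thousands separators, e.g. '1,036,800', to int."""
    return int(text.replace(",", ""))


def seconds_per_timestep(row):
    """Return CPU seconds per timestep for one tab-separated trickle row."""
    fields = row.split("\t")
    timestep = parse_int(fields[4])      # Timestep column
    cpu_seconds = parse_int(fields[5])   # CPU Time (sec) column
    return round(cpu_seconds / timestep, 4)


# Three rows copied from the table above (newest, a middle one, oldest).
rows = [
    "17 Jul 2014 04:14:34\t1045219\t16275805\thadcm3n_7c01_1980_40_008426836_3\t1,036,800\t1,835,554\t1.7704",
    "04 Jul 2014 14:22:35\t1045219\t16275805\thadcm3n_7c01_1980_40_008426836_3\t907,200\t1,585,155\t1.7473",
    "04 May 2014 18:05:14\t1045219\t16275805\thadcm3n_7c01_1980_40_008426836_3\t544,320\t1,011,591\t1.8584",
]

for row in rows:
    reported = float(row.split("\t")[6])  # Average (sec/TS) as shown on the page
    derived = seconds_per_timestep(row)
    print(f"reported={reported:.4f} derived={derived:.4f}")
```

For these three rows the derived value matches the reported one, which is consistent with the assumption but does not confirm how the server computes the column.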