Task 13287479

Name	hadcm3n_p1xc_1940_40_007419905_0
Workunit	7617540
Created	24 Aug 2011, 20:23:25 UTC
Sent	24 Aug 2011, 20:23:30 UTC
Report deadline	24 Nov 2011, 3:50:41 UTC
Received	15 Dec 2011, 19:17:10 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1137703
Run time	34 days 19 hours 45 min 8 sec
CPU time	20 days 17 hours 50 min 59 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	1.32 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3120, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2144, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4540, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6088, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4684, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4548, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4872, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5516, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: C I/O Error feof - Unit 62 - Return code = 16
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/p1xcko.pjf0c10
Error converting file to netcdf: dataout/p1xcko.pif0c10
Error converting file to netcdf: dataout/p1xcko.pff0c10
Error converting file to netcdf: dataout/p1xcko.pcf0c10
Error converting file to netcdf: dataout/p1xcko.pbf0c10
Error converting file to netcdf: dataout/p1xcko.paf0c10
Error converting file to netcdf: dataout/p1xcka.phf0c10
Error converting file to netcdf: dataout/p1xcka.pgf0c10
Error converting file to netcdf: dataout/p1xcka.pef0c10
Error converting file to netcdf: dataout/p1xcka.pdf0c10
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6036, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4952, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3576, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5100, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5736, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5776, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5688, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5848, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5524, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2168, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3600, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5396, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5944, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
15 Dec 2011 18:20:14	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	518,400	1,792,253	3.4573
03 Dec 2011 19:15:35	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	492,480	1,706,996	3.4661
19 Nov 2011 13:35:05	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	466,560	1,612,834	3.4569
07 Nov 2011 15:37:51	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	440,640	1,523,814	3.4582
05 Nov 2011 16:47:57	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	414,720	1,415,182	3.4124
03 Nov 2011 22:58:49	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	388,800	1,296,483	3.3346
31 Oct 2011 18:48:22	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	362,880	1,191,384	3.2831
31 Oct 2011 15:36:37	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	336,960	1,091,771	3.2401
31 Oct 2011 13:51:19	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	311,040	985,538	3.1685
31 Oct 2011 13:51:19	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	285,120	888,568	3.1165
13 Oct 2011 19:00:47	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	259,200	804,200	3.1026
04 Oct 2011 19:55:56	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	233,280	722,345	3.0965
02 Oct 2011 09:36:21	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	207,360	643,740	3.1045
28 Sep 2011 17:18:00	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	181,440	566,837	3.1241
20 Sep 2011 12:05:22	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	155,520	487,735	3.1362
16 Sep 2011 08:00:31	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	129,600	408,786	3.1542
12 Sep 2011 17:54:35	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	103,680	326,486	3.1490
09 Sep 2011 12:13:17	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	77,760	248,636	3.1975
06 Sep 2011 14:29:54	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	51,840	170,230	3.2838
02 Sep 2011 08:39:46	1137703	13287479	hadcm3n_p1xc_1940_40_007419905_0	25,920	93,526	3.6083
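
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimals. The following is a minimal Python sketch that checks this against three rows copied from the table; the function and variable names are illustrative and are not part of BOINC or the CPDN software.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
# Assumption: Average (sec/TS) = CPU Time (sec) / Timestep, rounded to 4 places.
SAMPLE_ROWS = [
    (518_400, 1_792_253, 3.4573),  # 15 Dec 2011 trickle
    (259_200, 804_200, 3.1026),    # 13 Oct 2011 trickle
    (25_920, 93_526, 3.6083),      # 02 Sep 2011 trickle
]


def avg_sec_per_timestep(timestep, cpu_time_sec):
    """Return cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


def rows_match(rows, tolerance=1e-4):
    """Return True if every recomputed average agrees with the reported value."""
    return all(
        abs(avg_sec_per_timestep(ts, cpu) - reported) < tolerance
        for ts, cpu, reported in rows
    )


if __name__ == "__main__":
    for ts, cpu, reported in SAMPLE_ROWS:
        print(ts, avg_sec_per_timestep(ts, cpu), reported)
    print("all rows consistent:", rows_match(SAMPLE_ROWS))
```

Because the column is a cumulative average, its rise from about 3.10 to 3.46 sec/TS across the later trickles means the most recent timesteps cost more CPU time than the figure shown for any single row.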