Task 12388220

Name	famous_s3l1_999_200_006720883_6
Workunit	6924134
Created	16 Dec 2010, 0:21:11 UTC
Sent	16 Dec 2010, 0:22:07 UTC
Report deadline	17 Mar 2011, 7:49:18 UTC
Received	23 Dec 2010, 12:52:39 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	25 (0x00000019) Unknown error code
Computer ID	1117202
Run time	4 days 7 hours 38 min 5 sec
CPU time	4 days 6 hours 11 min 31 sec
Validate state	Invalid
Credit	2,779.43
Device peak FLOPS	2.79 GFLOPS
Application version	UK Met Office FAMOUS v6.11 windows_intelx86
Stderr	<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
Signal 11 received, exiting...
01:47:35 (3684): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1456, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
08:44:20 (4316): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3536, iMonCtr=1
Model crash detected, will try to restart...
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4216, selfPID=4216, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Model crash detected, will try to restart...
15:06:21 (2808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:09:25 (3368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:16:35 (4664): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
15:18:09 (5060): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Suspended CPDN Monitor - Suspend request from BOINC...
Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy
Signal 11 received, exiting...
09:17:50 (4380): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
11:37:38 (860): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
16:00:40 (1192): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Signal 11 received, exiting...
19:59:52 (4804): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
20:15:02 (3504): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3244, iMonCtr=1
Model crash detected, will try to restart...
Signal 11 received, exiting...
12:30:51 (3532): called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3188, iMonCtr=1
Model crash detected, will try to restart...
22:49:40 (3344): called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
23 Dec 2010 13:06:34	1117202	12388220	famous_s3l1_999_200_006720883_6	842,426	367,048	0.4357
23 Dec 2010 13:06:34	1117202	12388220	famous_s3l1_999_200_006720883_6	833,066	362,953	0.4357
23 Dec 2010 13:06:34	1117202	12388220	famous_s3l1_999_200_006720883_6	823,706	358,869	0.4357
23 Dec 2010 13:06:34	1117202	12388220	famous_s3l1_999_200_006720883_6	814,346	354,774	0.4357
23 Dec 2010 00:51:45	1117202	12388220	famous_s3l1_999_200_006720883_6	804,986	350,693	0.4357
22 Dec 2010 23:42:39	1117202	12388220	famous_s3l1_999_200_006720883_6	795,626	346,602	0.4356
22 Dec 2010 22:34:56	1117202	12388220	famous_s3l1_999_200_006720883_6	786,266	342,529	0.4356
22 Dec 2010 21:20:42	1117202	12388220	famous_s3l1_999_200_006720883_6	776,906	338,451	0.4356
22 Dec 2010 20:08:04	1117202	12388220	famous_s3l1_999_200_006720883_6	767,546	334,370	0.4356
22 Dec 2010 19:47:51	1117202	12388220	famous_s3l1_999_200_006720883_6	758,186	330,286	0.4356
22 Dec 2010 19:42:05	1117202	12388220	famous_s3l1_999_200_006720883_6	748,826	326,210	0.4356
22 Dec 2010 16:36:56	1117202	12388220	famous_s3l1_999_200_006720883_6	739,466	322,134	0.4356
22 Dec 2010 15:29:42	1117202	12388220	famous_s3l1_999_200_006720883_6	730,106	318,057	0.4356
22 Dec 2010 14:19:07	1117202	12388220	famous_s3l1_999_200_006720883_6	720,746	313,967	0.4356
22 Dec 2010 14:00:58	1117202	12388220	famous_s3l1_999_200_006720883_6	711,386	310,057	0.4358
22 Dec 2010 14:00:58	1117202	12388220	famous_s3l1_999_200_006720883_6	702,026	306,226	0.4362
22 Dec 2010 14:00:58	1117202	12388220	famous_s3l1_999_200_006720883_6	692,666	302,398	0.4366
22 Dec 2010 14:00:58	1117202	12388220	famous_s3l1_999_200_006720883_6	683,306	298,510	0.4369
22 Dec 2010 14:00:58	1117202	12388220	famous_s3l1_999_200_006720883_6	673,946	294,404	0.4368
22 Dec 2010 14:00:58	1117202	12388220	famous_s3l1_999_200_006720883_6	664,586	290,306	0.4368
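
The Average (sec/TS) column appears to be the cumulative CPU Time (sec) divided by the Timestep reached, rounded to four decimal places. That is an inference from the figures in the table above, not something the page states. A minimal sketch that recomputes the column for three of the rows; the function name seconds_per_timestep is hypothetical and used only for illustration:

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Average CPU seconds spent per model timestep (assumed definition)."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return cpu_time_sec / timestep


# (Timestep, CPU Time in seconds) copied from three trickle rows above.
rows = [
    (842426, 367048),
    (711386, 310057),
    (664586, 290306),
]

for timestep, cpu_time in rows:
    average = seconds_per_timestep(cpu_time, timestep)
    # Tab-separated, four decimals, to match the table's layout.
    print(f"{timestep:,}\t{cpu_time:,}\t{average:.4f}")
```

Under that assumption, the printed averages should agree with the values listed for the same rows.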