Task 16044160

Name	hadcm3n_oe5a_1900_40_008473457_0
Workunit	8624296
Created	27 Sep 2013, 10:21:25 UTC
Sent	29 Sep 2013, 0:11:55 UTC
Report deadline	29 Dec 2013, 7:39:06 UTC
Received	26 Oct 2013, 2:53:40 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	521854
Run time	7 days 6 hours 42 min 10 sec
CPU time	2 days 12 hours 55 min 45 sec
Validate state	Invalid
Credit	4,976.64
Device peak FLOPS	3.32 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.2.11</core_client_version>
<![CDATA[
<message>
The device does not recognise the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:34:48 (3792): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
BUFFOUT: C I/O Error - Return code = 32
Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
03:53:58 (8256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8228, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8228, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8228, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8228, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8228, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8228, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8228, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>
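
The stderr output above repeats a small set of messages, so a quick tally can make the failure pattern easier to read. The following is a minimal sketch, not part of BOINC or CPDN: the function name `tally_markers` and the choice of marker strings are assumptions made for illustration, and the sample text is only a short excerpt copied from the log above.

```python
# Hypothetical helper (not a BOINC/CPDN API): count how often selected
# message fragments appear in a stderr dump.
MARKERS = (
    "No heartbeat from core client",
    "Signal 22 received, exiting...",
    "Model crash detected, will try to restart...",
    "Sorry, too many model crashes!",
)


def tally_markers(stderr_text, markers=MARKERS):
    """Return a dict mapping each marker string to its occurrence count."""
    return {marker: stderr_text.count(marker) for marker in markers}


# Short excerpt copied from the stderr output of this task.
excerpt = (
    "03:53:58 (8256): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Signal 22 received, exiting... "
    "Called boinc_finish "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=8228, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Signal 22 received, exiting..."
)

for marker, count in tally_markers(excerpt).items():
    print(f"{count:3d}  {marker}")
```

Run against the full stderr text rather than the excerpt, the same tally would show how many restart attempts preceded the final "too many model crashes" message.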

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
25 Oct 2013 23:37:53	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	414,720	211,656	0.5104
25 Oct 2013 13:15:57	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	388,800	567,440	1.4595
25 Oct 2013 02:25:10	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	362,880	528,862	1.4574
24 Oct 2013 14:18:50	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	336,960	490,048	1.4543
24 Oct 2013 03:11:21	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	311,040	451,250	1.4508
23 Oct 2013 16:07:42	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	285,120	412,497	1.4467
23 Oct 2013 05:10:33	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	259,200	373,894	1.4425
22 Oct 2013 18:20:16	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	233,280	335,269	1.4372
22 Oct 2013 07:32:33	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	207,360	297,073	1.4326
21 Oct 2013 21:43:19	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	181,440	259,022	1.4276
21 Oct 2013 10:49:01	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	155,520	220,968	1.4208
21 Oct 2013 00:10:14	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	129,600	182,833	1.4107
20 Oct 2013 13:32:19	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	103,680	147,167	1.4194
20 Oct 2013 03:36:58	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	77,760	112,661	1.4488
19 Oct 2013 09:00:27	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	51,840	76,145	1.4688
18 Oct 2013 23:21:40	521854	16044160	hadcm3n_oe5a_1900_40_008473457_0	25,920	38,260	1.4761
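
The Average (sec/TS) column appears to be the CPU Time (sec) value divided by the Timestep value, rounded to four decimal places. The sketch below checks that assumption against a few rows copied from the table; the function name `average_sec_per_timestep` is hypothetical and the relationship is inferred from the numbers, not taken from project documentation.

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
ROWS = [
    (25_920, 38_260, 1.4761),
    (51_840, 76_145, 1.4688),
    (388_800, 567_440, 1.4595),
    (414_720, 211_656, 0.5104),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds per model timestep."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in ROWS:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>8}  computed={computed:.4f}  reported={reported:.4f}")
```

The most recent trickle reports a lower CPU time (211,656 sec) than the one before it (567,440 sec). The assumed formula still reproduces the reported average for that row, so the drop is in the recorded CPU time itself, not in how the average was derived.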