Task 12833436

Name	hadcm3n_p72p_1900_40_007226753_0
Workunit	7424993
Created	26 Apr 2011, 15:38:27 UTC
Sent	26 Apr 2011, 23:44:23 UTC
Report deadline	27 Jul 2011, 7:11:34 UTC
Received	8 May 2011, 4:37:08 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	-2 (0xFFFFFFFE) Unknown error code
Computer ID	821153
Run time	10 days 14 hours 45 min 55 sec
CPU time	9 days 5 hours 54 min 43 sec
Validate state	Invalid
Credit	5,598.72
Device peak FLOPS	2.30 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
 - exit code -2 (0xfffffffe)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048
forrtl: Not enough storage is available to process this command.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5432, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Not enough storage is available to process this command.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5432, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: error reading file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_p72p_1900_40_007226753/dataout/ocean_restart.day
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5432, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
01:30:09 (4124): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
01:30:10 (4124): No heartbeat from core client for 30 sec - exiting
01:30:11 (4124): No heartbeat from core client for 30 sec - exiting
01:30:12 (4124): No heartbeat from core client for 30 sec - exiting
01:30:13 (4124): No heartbeat from core client for 30 sec - exiting
01:30:14 (4124): No heartbeat from core client for 30 sec - exiting
01:30:15 (4124): No heartbeat from core client for 30 sec - exiting
01:30:16 (4124): No heartbeat from core client for 30 sec - exiting
01:30:17 (4124): No heartbeat from core client for 30 sec - exiting
01:30:18 (4124): No heartbeat from core client for 30 sec - exiting
01:30:19 (4124): No heartbeat from core client for 30 sec - exiting
01:30:20 (4124): No heartbeat from core client for 30 sec - exiting
01:30:21 (4124): No heartbeat from core client for 30 sec - exiting
01:30:22 (4124): No heartbeat from core client for 30 sec - exiting
01:30:23 (4124): No heartbeat from core client for 30 sec - exiting
01:30:24 (4124): No heartbeat from core client for 30 sec - exiting
01:30:25 (4124): No heartbeat from core client for 30 sec - exiting
01:30:26 (4124): No heartbeat from core client for 30 sec - exiting
01:30:27 (4124): No heartbeat from core client for 30 sec - exiting
01:30:28 (4124): No heartbeat from core client for 30 sec - exiting
01:30:29 (4124): No heartbeat from core client for 30 sec - exiting
01:30:31 (4124): No heartbeat from core client for 30 sec - exiting
01:30:32 (4124): No heartbeat from core client for 30 sec - exiting
01:30:33 (4124): No heartbeat from core client for 30 sec - exiting
01:30:34 (4124): No heartbeat from core client for 30 sec - exiting
01:30:35 (4124): No heartbeat from core client for 30 sec - exiting
01:30:36 (4124): No heartbeat from core client for 30 sec - exiting
01:30:37 (4124): No heartbeat from core client for 30 sec - exiting
01:30:38 (4124): No heartbeat from core client for 30 sec - exiting
01:30:39 (4124): No heartbeat from core client for 30 sec - exiting
01:30:40 (4124): No heartbeat from core client for 30 sec - exiting
01:30:41 (4124): No heartbeat from core client for 30 sec - exiting
01:30:42 (4124): No heartbeat from core client for 30 sec - exiting
01:30:43 (4124): No heartbeat from core client for 30 sec - exiting
01:30:44 (4124): No heartbeat from core client for 30 sec - exiting
01:30:45 (4124): No heartbeat from core client for 30 sec - exiting
01:30:46 (4124): No heartbeat from core client for 30 sec - exiting
01:30:47 (4124): No heartbeat from core client for 30 sec - exiting
01:30:48 (4124): No heartbeat from core client for 30 sec - exiting
01:30:49 (4124): No heartbeat from core client for 30 sec - exiting
01:30:50 (4124): No heartbeat from core client for 30 sec - exiting
01:30:51 (4124): No heartbeat from core client for 30 sec - exiting
01:30:52 (4124): No heartbeat from core client for 30 sec - exiting
01:30:53 (4124): No heartbeat from core client for 30 sec - exiting
01:30:54 (4124): No heartbeat from core client for 30 sec - exiting
01:30:55 (4124): No heartbeat from core client for 30 sec - exiting
01:30:56 (4124): No heartbeat from core client for 30 sec - exiting
01:30:57 (4124): No heartbeat from core client for 30 sec - exiting
01:30:58 (4124): No heartbeat from core client for 30 sec - exiting
01:30:59 (4124): No heartbeat from core client for 30 sec - exiting
01:31:00 (4124): No heartbeat from core client for 30 sec - exiting
01:31:01 (4124): No heartbeat from core client for 30 sec - exiting
01:31:02 (4124): No heartbeat from core client for 30 sec - exiting
01:31:03 (4124): No heartbeat from core client for 30 sec - exiting
01:31:04 (4124): No heartbeat from core client for 30 sec - exiting
01:31:05 (4124): No heartbeat from core client for 30 sec - exiting
01:31:06 (4124): No heartbeat from core client for 30 sec - exiting
01:31:07 (4124): No heartbeat from core client for 30 sec - exiting
01:31:08 (4124): No heartbeat from core client for 30 sec - exiting
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4548, selfPID=4548, iMonCtr=1
Could not launch model process. Last Error=5
Called boinc_finish
</stderr_txt>
]]>
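
The exit status appears twice in this record, once as the signed value -2 and once as the hexadecimal 0xFFFFFFFE. The two are presumably the same 32-bit value read as signed and as unsigned (two's complement). A minimal Python sketch of that conversion follows; the helper names `to_unsigned32`, `to_signed32`, and `format_exit_status` are hypothetical and are not part of BOINC.

```python
# Sketch only: shows how a signed 32-bit exit code maps to its hex form.
# Helper names are illustrative assumptions, not BOINC functions.

def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit integer as unsigned (two's complement)."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as signed (two's complement)."""
    return value - 0x100000000 if value >= 0x80000000 else value


def format_exit_status(code: int) -> str:
    """Render an exit code in the 'signed (0xHEX)' style used in the record."""
    return f"{code} (0x{to_unsigned32(code):08X})"


if __name__ == "__main__":
    print(format_exit_status(-2))
    print(to_signed32(0xFFFFFFFE))
```

The round trip through `to_signed32` recovers -2 from 0xFFFFFFFE, which is consistent with both forms in the record describing one exit code.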

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
07 May 2011 18:27:30	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	466,560	784,491	1.6814
07 May 2011 05:50:41	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	440,640	740,481	1.6805
06 May 2011 17:11:55	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	414,720	696,406	1.6792
06 May 2011 03:56:32	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	388,800	652,263	1.6776
05 May 2011 14:49:56	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	362,880	607,572	1.6743
05 May 2011 00:01:06	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	336,960	562,510	1.6694
04 May 2011 10:53:59	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	311,040	518,315	1.6664
03 May 2011 22:04:07	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	285,120	474,531	1.6643
03 May 2011 08:13:47	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	259,200	471,919	1.8207
02 May 2011 05:06:36	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	233,280	425,400	1.8236
01 May 2011 14:59:31	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	207,360	378,004	1.8229
01 May 2011 01:05:34	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	181,440	330,686	1.8226
30 Apr 2011 10:45:55	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	155,520	283,557	1.8233
29 Apr 2011 20:57:56	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	129,600	236,316	1.8234
29 Apr 2011 07:26:20	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	103,680	189,089	1.8238
28 Apr 2011 17:26:24	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	77,760	141,554	1.8204
28 Apr 2011 03:30:40	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	51,840	94,515	1.8232
27 Apr 2011 13:47:18	821153	12833436	hadcm3n_p72p_1900_40_007226753_0	25,920	47,271	1.8237
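
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The Python sketch below checks that reading against three rows copied from the table above; the function name `average_sec_per_timestep` is a hypothetical helper, and the rounding rule is an assumption inferred from the displayed values rather than from project documentation.

```python
# Sketch only: recomputes the "Average (sec/TS)" column from the two
# columns to its left. The formula is inferred from the table, not documented.

def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
ROWS = [
    (466_560, 784_491, 1.6814),
    (259_200, 471_919, 1.8207),
    (25_920, 47_271, 1.8237),
]

if __name__ == "__main__":
    for timestep, cpu_time, reported in ROWS:
        computed = average_sec_per_timestep(cpu_time, timestep)
        print(timestep, cpu_time, computed, reported)
```

Because the average is cumulative rather than per-trickle, a single fast or slow interval shifts every later value, which would explain the step from about 1.82 to about 1.66 between timesteps 259,200 and 285,120.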