Task 14778089

Name	hadcm3n_yjwt_1980_40_007999712_2
Workunit	8154826
Created	6 Jun 2012, 12:43:48 UTC
Sent	6 Jun 2012, 16:34:25 UTC
Report deadline	6 Sep 2012, 0:01:36 UTC
Received	8 Jul 2012, 17:26:50 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	255 (0x000000FF) Unknown error code
Computer ID	775427
Run time	19 days 6 hours 26 min 44 sec
CPU time	18 days 2 hours 4 min 28 sec
Validate state	Invalid
Credit	9,331.20
Device peak FLOPS	2.31 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The extended attributes are inconsistent. (0xff) - exit code 255 (0xff)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=98624, iMonCtr=1
Model crash detected, will try to restart...
20:31:11 (5784): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:52:37 (36060): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3684, iMonCtr=1
Model crash detected, will try to restart...
22:16:48 (6024): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:31:44 (5308): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:40:43 (23060): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
14:19:46 (40100): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
21:54:23 (5944): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:51:27 (4168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:09:25 (9412): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:33:41 (12220): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8416, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3736, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5136, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
15:45:21 (4512): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6576, iMonCtr=1
Model crash detected, will try to restart...
01:34:19 (5388): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:20:04 (9316): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6096, iMonCtr=1
Model crash detected, will try to restart...
20:57:12 (5436): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9388, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
18:06:57 (3616): No heartbeat from core client for 30 sec - exiting
18:06:59 (3616): No heartbeat from core client for 30 sec - exiting
18:07:00 (3616): No heartbeat from core client for 30 sec - exiting
18:07:01 (3616): No heartbeat from core client for 30 sec - exiting
18:07:02 (3616): No heartbeat from core client for 30 sec - exiting
18:07:03 (3616): No heartbeat from core client for 30 sec - exiting
18:07:04 (3616): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
zip error: Could not create output file (was replacing the original zip file)
Suspended CPDN Monitor - Suspend request from BOINC...
10:31:08 (3368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:41:32 (5960): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
10:52:55 (5128): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9112, iMonCtr=1
Model crash detected, will try to restart...
09:38:29 (4800): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:38:30 (4800): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
14:43:53 (6396): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
07:26:51 (2608): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3244, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
19:28:47 (4588): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:36:47 (6016): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8752, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:35:07 (9344): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9008, iMonCtr=1
Model crash detected, will try to restart...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x76F7748F read attempt to address 0x994E992E
Engaging BOINC Windows Runtime Debugger...
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77926E5F read attempt to address 0x00000000
Engaging BOINC Windows Runtime Debugger...
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
07 Jul 2012 19:26:00	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	777,600	1,526,878	1.9636
06 Jul 2012 14:10:11	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	751,680	1,474,961	1.9622
04 Jul 2012 01:45:31	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	725,760	1,425,700	1.9644
03 Jul 2012 02:57:04	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	699,840	1,375,593	1.9656
02 Jul 2012 20:55:44	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	673,920	1,325,851	1.9674
02 Jul 2012 20:55:44	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	648,000	1,275,575	1.9685
30 Jun 2012 08:36:47	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	622,080	1,225,807	1.9705
29 Jun 2012 08:53:06	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	596,160	1,174,503	1.9701
28 Jun 2012 16:42:16	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	570,240	1,122,444	1.9684
27 Jun 2012 16:32:29	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	544,320	1,070,208	1.9661
27 Jun 2012 00:09:40	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	518,400	1,018,197	1.9641
26 Jun 2012 00:15:24	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	492,480	965,842	1.9612
24 Jun 2012 20:07:02	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	466,560	913,545	1.9580
24 Jun 2012 05:05:58	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	440,640	861,804	1.9558
23 Jun 2012 14:29:46	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	414,720	811,662	1.9571
23 Jun 2012 00:08:58	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	388,800	761,566	1.9588
21 Jun 2012 05:18:02	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	362,880	710,693	1.9585
20 Jun 2012 14:09:39	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	336,960	658,920	1.9555
19 Jun 2012 14:27:31	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	311,040	608,524	1.9564
18 Jun 2012 16:51:40	775427	14778089	hadcm3n_yjwt_1980_40_007999712_2	285,120	558,501	1.9588