Task 12492371

Name famous_wnqe_1199_200_007117869_0
Workunit 7316229
Created 16 Jan 2011, 15:29:25 UTC
Sent 18 Jan 2011, 21:23:47 UTC
Report deadline 20 Apr 2011, 4:50:58 UTC
Received 13 Jun 2011, 22:42:41 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS
Computer ID 1103724
Run time 30 days 4 hours 30 min 8 sec
CPU time 18 days 19 hours 1 min 15 sec
Validate state Invalid
Credit 3,675.00
Device peak FLOPS 0.82 GFLOPS
Application version UK Met Office FAMOUS v6.11 windows_intelx86
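The exit status above is shown both as a signed decimal (-226) and as a 32-bit hexadecimal value (0xFFFFFF1E), and the run and CPU times are given as day/hour/minute/second strings. The following is a minimal sketch of how those figures relate, assuming the hex form is a two's-complement 32-bit encoding; the helper names `to_signed32` and `duration_seconds` are hypothetical and are not part of BOINC or CPDN.

```python
def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def duration_seconds(days: int, hours: int, minutes: int, seconds: int) -> int:
    """Convert a days/hours/minutes/seconds duration to total seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# Values copied from the task summary above.
exit_status = to_signed32(0xFFFFFF1E)
run_time = duration_seconds(30, 4, 30, 8)    # "Run time" field
cpu_time = duration_seconds(18, 19, 1, 15)   # "CPU time" field

# Fraction of the elapsed run time that was spent on the CPU.
cpu_fraction = cpu_time / run_time

print(exit_status, run_time, cpu_time, f"{cpu_fraction:.2%}")
```

Under that assumption the hex value decodes to -226, matching the decimal shown, and the CPU time comes to roughly 62% of the run time.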
Stderr
<core_client_version>6.6.28</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
09:51:40 (2168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3728, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2512, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1
Model crash detected, will try to restart...
09:52:52 (6088): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
07:43:32 (1724): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:44:23 (2704): Can't acquire lockfile (32) - waiting 35s
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4972, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5588, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2260, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3648, iMonCtr=1
Model crash detected, will try to restart...
16:13:54 (6124): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1660, iMonCtr=1
Model crash detected, will try to restart...
C09:35:09 (3552): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4708, iMonCtr=1
Model crash detected, will try to restart...
18:45:26 (2072): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5976, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3408, iMonCtr=1
Model crash detected, will try to restart...
15:35:04 (1576): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2096, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5080, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4420, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5468, iMonCtr=1
Model crash detected, will try to restart...
23:17:28 (5536): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:17:49 (4772): Can't acquire lockfile (32) - waiting 35s
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4664, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3484, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
09:30:11 (3324): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:20:44 (6040): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
08:54:25 (2620): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6048, iMonCtr=1
Model crash detected, will try to restart...
06:24:24 (4716): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:24:40 (4716): No heartbeat from core client for 30 sec - exiting

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4540, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
C09:20:12 (5656): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:05:13 (4964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:11:43 (3856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:11:44 (3856): No heartbeat from core client for 30 sec - exiting
14:11:45 (3856): No heartbeat from core client for 30 sec - exiting
14:11:46 (3856): No heartbeat from core client for 30 sec - exiting
14:11:47 (3856): No heartbeat from core client for 30 sec - exiting
14:11:48 (3856): No heartbeat from core client for 30 sec - exiting
14:11:49 (3856): No heartbeat from core client for 30 sec - exiting
14:11:50 (3856): No heartbeat from core client for 30 sec - exiting
14:11:51 (3856): No heartbeat from core client for 30 sec - exiting
14:11:52 (3856): No heartbeat from core client for 30 sec - exiting
14:11:53 (3856): No heartbeat from core client for 30 sec - exiting
19:45:52 (1560): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:38:55 (4576): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:56:55 (5624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4772, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep   CPU Time (sec)  Average (sec/TS)
13 Jun 2011 18:55:41  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,113,866  1,617,457       1.4521
10 Jun 2011 15:27:15  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,104,506  1,593,103       1.4424
08 Jun 2011 06:17:44  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,095,146  1,577,461       1.4404
07 Jun 2011 05:02:26  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,085,786  1,563,135       1.4396
07 Jun 2011 00:28:46  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,076,426  1,549,591       1.4396
06 Jun 2011 17:09:01  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,067,066  1,533,186       1.4368
06 Jun 2011 00:56:30  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,057,706  1,518,398       1.4356
05 Jun 2011 04:09:44  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,048,346  1,498,967       1.4298
03 Jun 2011 17:56:07  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,038,986  1,478,304       1.4228
02 Jun 2011 14:31:46  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,029,626  1,461,753       1.4197
01 Jun 2011 14:21:17  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,020,266  1,441,346       1.4127
31 May 2011 17:13:04  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,010,906  1,422,543       1.4072
28 May 2011 11:25:23  1103724  12492371   famous_wnqe_1199_200_007117869_0  1,001,546  1,403,478       1.4013
26 May 2011 13:36:34  1103724  12492371   famous_wnqe_1199_200_007117869_0  992,186    1,382,632       1.3935
23 May 2011 15:41:55  1103724  12492371   famous_wnqe_1199_200_007117869_0  982,826    1,358,343       1.3821
20 May 2011 13:17:44  1103724  12492371   famous_wnqe_1199_200_007117869_0  973,466    1,335,840       1.3723
16 May 2011 14:46:16  1103724  12492371   famous_wnqe_1199_200_007117869_0  964,106    1,309,072       1.3578
11 May 2011 16:10:41  1103724  12492371   famous_wnqe_1199_200_007117869_0  954,746    1,275,726       1.3362
08 May 2011 02:30:29  1103724  12492371   famous_wnqe_1199_200_007117869_0  945,386    1,258,786       1.3315
05 May 2011 19:19:33  1103724  12492371   famous_wnqe_1199_200_007117869_0  936,026    1,237,986       1.3226
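
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that reading against the two most recent rows; the function names `parse_int` and `seconds_per_timestep` are hypothetical, and the formula is an inference from the listed values rather than a documented CPDN definition.

```python
def parse_int(text: str) -> int:
    """Parse an integer written with thousands separators, e.g. '1,113,866'."""
    return int(text.replace(",", ""))


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to four decimals."""
    return round(cpu_time_sec / timestep, 4)


# (time sent, timestep, CPU time in seconds) copied from the two newest rows.
rows = [
    ("13 Jun 2011 18:55:41", "1,113,866", "1,617,457"),
    ("10 Jun 2011 15:27:15", "1,104,506", "1,593,103"),
]

averages = [
    seconds_per_timestep(parse_int(cpu), parse_int(step))
    for _, step, cpu in rows
]

# Number of timesteps between consecutive trickles.
step_gap = parse_int(rows[0][1]) - parse_int(rows[1][1])

print(averages, step_gap)
```

For these two rows the computed averages agree with the listed 1.4521 and 1.4424, and consecutive trickles are 9,360 timesteps apart.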