Task 11010620

Name: hadsm3dhet2_jolz_006594889_2
Workunit: 6798262
Created: 15 Mar 2010, 12:00:37 UTC
Sent: 5 Oct 2010, 18:21:35 UTC
Report deadline: 17 Sep 2011, 23:41:35 UTC
Received: 11 Nov 2010, 7:32:26 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 22 (0x00000016) Unknown error code
Computer ID: 1068252
Run time: 8 days 3 hours 14 min 3 sec
CPU time: 4 days 19 hours 47 min 4 sec
Validate state: Invalid
Credit: 2,778.81
Device peak FLOPS: 2.75 GFLOPS
Application version: UK Met Office HadSM3 Slab Model v6.07 (windows_intelx86)
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3748, iMonCtr=1
Model crash detected, will try to restart...
No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3720, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4236, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4672, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5876, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4756, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1476, iMonCtr=1
Model crash detected, will try to restart...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5072, iMonCtr=1
Model crash detected, will try to restart...
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
MainError:	07:12:51 PM	No files match the supplied pattern.
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4716, iMonCtr=1
Model crash detected, will try to restart...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
forrtl: Access is denied.

CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4712, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.

CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4712, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.

CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4712, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.

CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4712, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.

CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4712, iMonCtr=1
Model crash detected, will try to restart...
forrtl: Access is denied.

CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4712, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
called boinc_finish

</stderr_txt>
]]>
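
The stderr text above repeats a small number of distinct messages many times, so a tally can make the failure pattern easier to see. The following is a minimal sketch, not a CPDN or BOINC tool: the function name `tally_stderr` and the category labels are hypothetical, and the match strings are copied from the log lines above.

```python
from collections import Counter

# Hypothetical category labels mapped to substrings taken from the stderr text.
PATTERNS = {
    "model_crash": "Model crash detected",
    "no_heartbeat": "No heartbeat from core client",
    "quit_request": "Quit request from BOINC",
    "access_denied": "forrtl: Access is denied",
    "no_files_match": "No files match the supplied pattern",
    "gave_up": "too many model crashes",
}


def tally_stderr(stderr_text: str) -> Counter:
    """Count how many lines of the stderr text fall into each category."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


if __name__ == "__main__":
    # A short excerpt of the log above, used only to illustrate the call.
    sample = "\n".join([
        "Model crash detected, will try to restart...",
        "forrtl: Access is denied.",
        "Model crash detected, will try to restart...",
        "Sorry, too many model crashes! :-(",
    ])
    for label, count in sorted(tally_stderr(sample).items()):
        print(f"{label}: {count}")
```

Pasting the full contents of `<stderr_txt>` into `tally_stderr` would give the counts for this task; the excerpt here is only for illustration.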
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
10 Nov 2010 22:45:25 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 43,208 | 407,585 | 1.3476
08 Nov 2010 23:20:18 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 32,406 | 393,181 | 1.3481
08 Nov 2010 06:46:49 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 21,604 | 378,757 | 1.3486
07 Nov 2010 04:21:41 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 10,802 | 364,450 | 1.3496
06 Nov 2010 19:17:20 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 259,248 | 350,238 | 1.3510
06 Nov 2010 04:34:04 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 248,446 | 335,834 | 1.3517
05 Nov 2010 07:38:59 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 237,644 | 321,732 | 1.3538
04 Nov 2010 22:57:58 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 226,842 | 306,374 | 1.3506
04 Nov 2010 04:43:57 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 216,040 | 291,252 | 1.3481
28 Oct 2010 03:29:04 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 205,238 | 276,626 | 1.3478
27 Oct 2010 07:09:40 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 194,436 | 262,001 | 1.3475
26 Oct 2010 18:19:47 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 183,634 | 247,200 | 1.3462
25 Oct 2010 22:26:27 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 172,832 | 232,833 | 1.3472
24 Oct 2010 21:57:41 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 162,030 | 218,208 | 1.3467
22 Oct 2010 21:33:28 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 151,228 | 203,565 | 1.3461
22 Oct 2010 05:40:01 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 140,426 | 189,246 | 1.3477
21 Oct 2010 22:13:57 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 129,624 | 174,798 | 1.3485
21 Oct 2010 03:05:49 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 118,822 | 159,985 | 1.3464
20 Oct 2010 19:19:13 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 108,020 | 145,508 | 1.3470
20 Oct 2010 03:42:21 | 1068252 | 11010620 | hadsm3dhet2_jolz_006594889_2 | 97,218 | 130,975 | 1.3476
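
The Average (sec/TS) column in the trickle table above appears to be CPU time divided by the cumulative number of timesteps. That reading is an assumption, not something the page states: it requires treating the Timestep counter as restarting after it reaches 259,248, so the four most recent trickles count 259,248 earlier timesteps plus the value shown. The sketch below checks that reading against three rows; `PHASE_LENGTH` and `average_sec_per_timestep` are hypothetical names introduced here.

```python
# Assumption: the Timestep column restarts after 259,248 timesteps,
# the largest value shown in the trickle table.
PHASE_LENGTH = 259_248


def average_sec_per_timestep(cpu_seconds: int, timestep: int,
                             completed_phases: int = 0) -> float:
    """CPU seconds divided by the cumulative number of timesteps."""
    total_timesteps = completed_phases * PHASE_LENGTH + timestep
    return cpu_seconds / total_timesteps


if __name__ == "__main__":
    # (time sent, CPU time in sec, timestep, completed phases, listed average)
    rows = [
        ("10 Nov 2010 22:45:25", 407_585, 43_208, 1, 1.3476),
        ("07 Nov 2010 04:21:41", 364_450, 10_802, 1, 1.3496),
        ("06 Nov 2010 19:17:20", 350_238, 259_248, 0, 1.3510),
    ]
    for sent, cpu, ts, phases, listed in rows:
        computed = average_sec_per_timestep(cpu, ts, phases)
        print(f"{sent}: computed {computed:.4f}, listed {listed:.4f}")
```

Under this assumption the computed values agree with the listed averages to four decimal places for the rows checked, which supports the cumulative interpretation but does not confirm how the server actually computes the column.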