Name | hadcm3n_yjwt_1980_40_007999712_2 |
Workunit | 8154826 |
Created | 6 Jun 2012, 12:43:48 UTC |
Sent | 6 Jun 2012, 16:34:25 UTC |
Report deadline | 6 Sep 2012, 0:01:36 UTC |
Received | 8 Jul 2012, 17:26:50 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 255 (0x000000FF) Unknown error code |
Computer ID | 775427 |
Run time | 19 days 6 hours 26 min 44 sec |
CPU time | 18 days 2 hours 4 min 28 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 2.31 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> The extended attributes are inconsistent. (0xff) - exit code 255 (0xff) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=98624, iMonCtr=1 Model crash detected, will try to restart... 20:31:11 (5784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:52:37 (36060): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3684, iMonCtr=1 Model crash detected, will try to restart... 22:16:48 (6024): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:31:44 (5308): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:40:43 (23060): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:19:46 (40100): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:54:23 (5944): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 22:51:27 (4168): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:09:25 (9412): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:33:41 (12220): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8416, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3736, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5136, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 15:45:21 (4512): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6576, iMonCtr=1 Model crash detected, will try to restart... 01:34:19 (5388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:20:04 (9316): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6096, iMonCtr=1 Model crash detected, will try to restart... 20:57:12 (5436): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9388, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 18:06:57 (3616): No heartbeat from core client for 30 sec - exiting 18:06:59 (3616): No heartbeat from core client for 30 sec - exiting 18:07:00 (3616): No heartbeat from core client for 30 sec - exiting 18:07:01 (3616): No heartbeat from core client for 30 sec - exiting 18:07:02 (3616): No heartbeat from core client for 30 sec - exiting 18:07:03 (3616): No heartbeat from core client for 30 sec - exiting 18:07:04 (3616): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... zip error: Could not create output file (was replacing the original zip file) Suspended CPDN Monitor - Suspend request from BOINC... 
10:31:08 (3368): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:41:32 (5960): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 10:52:55 (5128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9112, iMonCtr=1 Model crash detected, will try to restart... 09:38:29 (4800): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:38:30 (4800): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:43:53 (6396): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 07:26:51 (2608): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3244, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 19:28:47 (4588): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:36:47 (6016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8752, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
16:35:07 (9344): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9008, iMonCtr=1 Model crash detected, will try to restart... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x76F7748F read attempt to address 0x994E992E Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77926E5F read attempt to address 0x00000000 Engaging BOINC Windows Runtime Debugger... </stderr_txt> ]]> |
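
The stderr text above repeats a handful of message types: lost heartbeats, model-crash restarts, suspend and quit requests, and the final access violations. As a rough aid for reading it, the sketch below tallies those types in a short excerpt copied from the log. The category names and the `count_events` helper are illustrative assumptions, not part of BOINC or the CPDN application.

```python
import re

# Hypothetical category names mapped to patterns taken from the log text above.
EVENT_PATTERNS = {
    "heartbeat_lost": r"\d{2}:\d{2}:\d{2} \(\d+\): No heartbeat from core client",
    "model_crash": re.escape("Model crash detected"),
    "suspend_request": re.escape("Suspend request from BOINC"),
    "quit_request": re.escape("Quit request from BOINC"),
    "access_violation": re.escape("Access Violation (0xc0000005)"),
}


def count_events(stderr_text: str) -> dict:
    """Count how often each known message type occurs in the stderr text."""
    return {
        name: len(re.findall(pattern, stderr_text))
        for name, pattern in EVENT_PATTERNS.items()
    }


# Short excerpt copied from the start of the stderr output above.
sample = (
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=98624, iMonCtr=1 Model crash detected, will try to "
    "restart... 20:31:11 (5784): No heartbeat from core client for 30 sec - "
    "exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN "
    "Monitor - Suspend request from BOINC..."
)

for name, count in count_events(sample).items():
    print(f"{name}: {count}")
```

Pasting the full contents of the `<stderr_txt>` element into `sample` would give totals for the whole run.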

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
07 Jul 2012 19:26:00 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 777,600 | 1,526,878 | 1.9636 |
06 Jul 2012 14:10:11 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 751,680 | 1,474,961 | 1.9622 |
04 Jul 2012 01:45:31 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 725,760 | 1,425,700 | 1.9644 |
03 Jul 2012 02:57:04 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 699,840 | 1,375,593 | 1.9656 |
02 Jul 2012 20:55:44 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 673,920 | 1,325,851 | 1.9674 |
02 Jul 2012 20:55:44 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 648,000 | 1,275,575 | 1.9685 |
30 Jun 2012 08:36:47 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 622,080 | 1,225,807 | 1.9705 |
29 Jun 2012 08:53:06 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 596,160 | 1,174,503 | 1.9701 |
28 Jun 2012 16:42:16 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 570,240 | 1,122,444 | 1.9684 |
27 Jun 2012 16:32:29 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 544,320 | 1,070,208 | 1.9661 |
27 Jun 2012 00:09:40 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 518,400 | 1,018,197 | 1.9641 |
26 Jun 2012 00:15:24 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 492,480 | 965,842 | 1.9612 |
24 Jun 2012 20:07:02 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 466,560 | 913,545 | 1.9580 |
24 Jun 2012 05:05:58 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 440,640 | 861,804 | 1.9558 |
23 Jun 2012 14:29:46 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 414,720 | 811,662 | 1.9571 |
23 Jun 2012 00:08:58 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 388,800 | 761,566 | 1.9588 |
21 Jun 2012 05:18:02 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 362,880 | 710,693 | 1.9585 |
20 Jun 2012 14:09:39 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 336,960 | 658,920 | 1.9555 |
19 Jun 2012 14:27:31 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 311,040 | 608,524 | 1.9564 |
18 Jun 2012 16:51:40 | 775427 | 14778089 | hadcm3n_yjwt_1980_40_007999712_2 | 285,120 | 558,501 | 1.9588 |
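
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That definition is an assumption inferred from the figures, not something the page states. A minimal sketch checking it against three rows copied from the table:

```python
# (timestep, cpu_time_sec, reported_average) copied from rows of the table above.
rows = [
    (777_600, 1_526_878, 1.9636),
    (751_680, 1_474_961, 1.9622),
    (285_120, 558_501, 1.9588),
]


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep (assumed definition)."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>9,} timesteps: {computed:.4f} sec/TS (reported {reported:.4f})")
```

For these rows the computed values agree with the reported column to four decimal places.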
©2024 climateprediction.net