Name | hadam3p_pnw_bpau_1990_1_008030844_1 |
Workunit | 8185958 |
Created | 6 Jul 2012, 16:24:56 UTC |
Sent | 6 Jul 2012, 16:24:58 UTC |
Report deadline | 18 Jun 2013, 21:44:58 UTC |
Received | 13 Jul 2012, 19:52:12 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1205594 |
Run time | 4 days 0 hours 41 min 3 sec |
CPU time | 3 days 9 hours 56 min 4 sec |
Validate state | Invalid |
Credit | 2,254.93 |
Device peak FLOPS | 3.16 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86 |
Stderr | <core_client_version>7.0.25</core_client_version> <![CDATA[ <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4140, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4148, selfPID=4784, iMonCtr=1 Model crash detected, will try to restart... 18:33:41 (2812): No heartbeat from core client for 30 sec - exiting 18:33:42 (2812): No heartbeat from core client for 30 sec - exiting 18:33:43 (2812): No heartbeat from core client for 30 sec - exiting 18:33:44 (2812): No heartbeat from core client for 30 sec - exiting 18:33:45 (2812): No heartbeat from core client for 30 sec - exiting 18:33:46 (2812): No heartbeat from core client for 30 sec - exiting 18:33:48 (2812): No heartbeat from core client for 30 sec - exiting 18:33:49 (2812): No heartbeat from core client for 30 sec - exiting 18:33:50 (2812): No heartbeat from core client for 30 sec - exiting 18:33:51 (2812): No heartbeat from core client for 30 sec - exiting 18:33:52 (2812): No heartbeat from core client for 30 sec - exiting 18:33:53 (2812): No heartbeat from core client for 30 sec - exiting 18:33:54 (2812): No heartbeat from core client for 30 sec - exiting 18:33:55 (2812): No heartbeat from core client for 30 sec - exiting 18:33:56 (2812): No heartbeat from core client for 30 sec - exiting 18:33:57 (2812): No heartbeat from core client for 30 sec - exiting 18:33:58 (2812): No heartbeat from core client for 30 sec - exiting 18:34:00 (2812): No heartbeat from core client for 30 sec - exiting 18:34:01 (2812): No heartbeat from core client for 30 sec - exiting 18:34:02 (2812): No heartbeat from core client for 30 sec - exiting 18:34:03 (2812): No heartbeat from core client for 30 sec - exiting 18:34:04 (2812): No heartbeat from core client for 30 sec - exiting 18:34:05 (2812): No heartbeat from core client for 30 sec - 
exiting 18:34:06 (2812): No heartbeat from core client for 30 sec - exiting 18:34:07 (2812): No heartbeat from core client for 30 sec - exiting 18:34:08 (2812): No heartbeat from core client for 30 sec - exiting 18:34:09 (2812): No heartbeat from core client for 30 sec - exiting 18:34:11 (2812): No heartbeat from core client for 30 sec - exiting 18:34:12 (2812): No heartbeat from core client for 30 sec - exiting 18:34:13 (2812): No heartbeat from core client for 30 sec - exiting 18:34:14 (2812): No heartbeat from core client for 30 sec - exiting 18:34:15 (2812): No heartbeat from core client for 30 sec - exiting 18:34:16 (2812): No heartbeat from core client for 30 sec - exiting 18:34:17 (2812): No heartbeat from core client for 30 sec - exiting 18:34:18 (2812): No heartbeat from core client for 30 sec - exiting 18:34:19 (2812): No heartbeat from core client for 30 sec - exiting 18:34:21 (2812): No heartbeat from core client for 30 sec - exiting 18:34:22 (2812): No heartbeat from core client for 30 sec - exiting 18:34:23 (2812): No heartbeat from core client for 30 sec - exiting 18:34:24 (2812): No heartbeat from core client for 30 sec - exiting 18:34:25 (2812): No heartbeat from core client for 30 sec - exiting 18:34:26 (2812): No heartbeat from core client for 30 sec - exiting 18:34:27 (2812): No heartbeat from core client for 30 sec - exiting 18:34:28 (2812): No heartbeat from core client for 30 sec - exiting 18:34:29 (2812): No heartbeat from core client for 30 sec - exiting 18:34:31 (2812): No heartbeat from core client for 30 sec - exiting 18:34:32 (2812): No heartbeat from core client for 30 sec - exiting 18:34:33 (2812): No heartbeat from core client for 30 sec - exiting 18:34:34 (2812): No heartbeat from core client for 30 sec - exiting 18:34:35 (2812): No heartbeat from core client for 30 sec - exiting 18:34:36 (2812): No heartbeat from core client for 30 sec - exiting 18:34:37 (2812): No heartbeat from core client for 30 sec - exiting 18:34:38 (2812): No 
heartbeat from core client for 30 sec - exiting 18:34:39 (2812): No heartbeat from core client for 30 sec - exiting 18:34:40 (2812): No heartbeat from core client for 30 sec - exiting 18:34:41 (2812): No heartbeat from core client for 30 sec - exiting 18:34:43 (2812): No heartbeat from core client for 30 sec - exiting 18:34:44 (2812): No heartbeat from core client for 30 sec - exiting 18:34:45 (2812): No heartbeat from core client for 30 sec - exiting 18:34:46 (2812): No heartbeat from core client for 30 sec - exiting 18:34:47 (2812): No heartbeat from core client for 30 sec - exiting 18:35:21 (2812): No heartbeat from core client for 30 sec - exiting 18:35:22 (2812): No heartbeat from core client for 30 sec - exiting 18:35:23 (2812): No heartbeat from core client for 30 sec - exiting 18:35:24 (2812): No heartbeat from core client for 30 sec - exiting 18:35:25 (2812): No heartbeat from core client for 30 sec - exiting 18:35:26 (2812): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
12:48:25 (3052): No heartbeat from core client for 30 sec - exiting 12:48:26 (3052): No heartbeat from core client for 30 sec - exiting 12:48:27 (3052): No heartbeat from core client for 30 sec - exiting 12:48:29 (3052): No heartbeat from core client for 30 sec - exiting 12:48:30 (3052): No heartbeat from core client for 30 sec - exiting 12:48:31 (3052): No heartbeat from core client for 30 sec - exiting 12:48:32 (3052): No heartbeat from core client for 30 sec - exiting 12:48:33 (3052): No heartbeat from core client for 30 sec - exiting 12:48:34 (3052): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3852, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3796, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=964, iMonCtr=2 18:44:20 (3864): No heartbeat from core client for 30 sec - exiting 18:44:21 (3864): No heartbeat from core client for 30 sec - exiting 18:44:22 (3864): No heartbeat from core client for 30 sec - exiting 18:44:23 (3864): No heartbeat from core client for 30 sec - exiting 18:44:24 (3864): No heartbeat from core client for 30 sec - exiting 18:44:25 (3864): No heartbeat from core client for 30 sec - exiting 18:44:26 (3864): No heartbeat from core client for 30 sec - exiting 18:44:27 (3864): No heartbeat from core client for 30 sec - exiting 18:44:29 (3864): No heartbeat from core client for 30 sec - exiting 18:44:30 (3864): No heartbeat from core client for 30 sec - exiting 18:44:31 (3864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2356, selfPID=1772, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3208, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3336, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 9 Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1468, selfPID=3092, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 9 cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_pnw_bpau_1990_1_008030844/dataout/atmos_restart.day after 11 attempts forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_pnw_bpau_1990_1_008030844\tmp\xaakm.namelists Image PC Routine Line Source hadam3p_pnw_um_6. 00FCA39A Unknown Unknown Unknown hadam3p_pnw_um_6. 00F72CD0 Unknown Unknown Unknown hadam3p_pnw_um_6. 00F71E9A Unknown Unknown Unknown hadam3p_pnw_um_6. 00F52819 Unknown Unknown Unknown hadam3p_pnw_um_6. 00E52287 Unknown Unknown Unknown hadam3p_pnw_um_6. 00EEE7B2 Unknown Unknown Unknown hadam3p_pnw_um_6. 00EEF2DA Unknown Unknown Unknown hadam3p_pnw_um_6. 00C69BD2 Unknown Unknown Unknown hadam3p_pnw_um_6. 
00FAE638 Unknown Unknown Unknown kernel32.dll 750C339A Unknown Unknown Unknown ntdll.dll 77409EF2 Unknown Unknown Unknown ntdll.dll 77409EC5 Unknown Unknown Unknown forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_pnw_bpau_1990_1_008030844\tmp\xaakg.namelists Image PC Routine Line Source hadrm3p_pnw_um_6. 0136C52A Unknown Unknown Unknown hadrm3p_pnw_um_6. 01314460 Unknown Unknown Unknown hadrm3p_pnw_um_6. 0131362A Unknown Unknown Unknown hadrm3p_pnw_um_6. 012F2469 Unknown Unknown Unknown hadrm3p_pnw_um_6. 011F66EB Unknown Unknown Unknown hadrm3p_pnw_um_6. 01292AE2 Unknown Unknown Unknown hadrm3p_pnw_um_6. 012935AF Unknown Unknown Unknown hadrm3p_pnw_um_6. 01039860 Unknown Unknown Unknown hadrm3p_pnw_um_6. 01350893 Unknown Unknown Unknown kernel32.dll 750C339A Unknown Unknown Unknown ntdll.dll 77409EF2 Unknown Unknown Unknown ntdll.dll 77409EC5 Unknown Unknown Unknown Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3876, selfPID=2748, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 7 Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_pnw_bpau_1990_1_008030844_1_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_bpau_1990_1_008030844_1_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_bpau_1990_1_008030844_1_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
13 Jul 2012 07:39:17 | 1205594 | 14872151 | hadam3p_pnw_bpau_1990_1_008030844_1 | 103,776 | 267,671 | 2.5793 |
12 Jul 2012 20:00:43 | 1205594 | 14872151 | hadam3p_pnw_bpau_1990_1_008030844_1 | 92,256 | 233,355 | 2.5294 |
12 Jul 2012 07:16:35 | 1205594 | 14872151 | hadam3p_pnw_bpau_1990_1_008030844_1 | 80,736 | 201,586 | 2.4969 |
11 Jul 2012 09:08:44 | 1205594 | 14872151 | hadam3p_pnw_bpau_1990_1_008030844_1 | 69,216 | 170,250 | 2.4597 |
10 Jul 2012 11:21:00 | 1205594 | 14872151 | hadam3p_pnw_bpau_1990_1_008030844_1 | 57,696 | 140,715 | 2.4389 |
09 Jul 2012 07:44:36 | 1205594 | 14872151 | hadam3p_pnw_bpau_1990_1_008030844_1 | 46,176 | 111,818 | 2.4216 |
08 Jul 2012 23:28:07 | 1205594 | 14872151 | hadam3p_pnw_bpau_1990_1_008030844_1 | 34,656 | 84,136 | 2.4277 |
08 Jul 2012 13:45:54 | 1205594 | 14872151 | hadam3p_pnw_bpau_1990_1_008030844_1 | 23,136 | 56,511 | 2.4426 |
07 Jul 2012 17:10:14 | 1205594 | 14872151 | hadam3p_pnw_bpau_1990_1_008030844_1 | 11,616 | 29,046 | 2.5005 |
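
The Average (sec/TS) column appears to be the cumulative CPU Time divided by the Timestep count, rounded to four decimals. That relationship is inferred from the numbers in the table and is not stated by the project. A minimal Python sketch that rechecks it, using the rows copied from the table above (the names `TRICKLES`, `seconds_per_timestep`, and `check_rows` are hypothetical, chosen for illustration):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
TRICKLES = [
    (103_776, 267_671, 2.5793),
    (92_256, 233_355, 2.5294),
    (80_736, 201_586, 2.4969),
    (69_216, 170_250, 2.4597),
    (57_696, 140_715, 2.4389),
    (46_176, 111_818, 2.4216),
    (34_656, 84_136, 2.4277),
    (23_136, 56_511, 2.4426),
    (11_616, 29_046, 2.5005),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by cumulative model timesteps."""
    return cpu_time_sec / timestep


def check_rows(rows, tolerance=1e-4):
    """Return the rows whose recomputed average differs from the reported one.

    The assumption being tested is average = cpu_time / timestep; the
    tolerance allows for the four-decimal rounding used in the table.
    """
    mismatches = []
    for timestep, cpu_time_sec, reported in rows:
        recomputed = seconds_per_timestep(cpu_time_sec, timestep)
        if abs(recomputed - reported) > tolerance:
            mismatches.append((timestep, cpu_time_sec, reported))
    return mismatches


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in TRICKLES:
        recomputed = seconds_per_timestep(cpu_time_sec, timestep)
        print(f"{timestep:>7} TS  reported {reported:.4f}  recomputed {recomputed:.4f}")
    print("mismatching rows:", check_rows(TRICKLES))
```

An empty mismatch list means every reported average is consistent with the assumed formula.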
©2024 climateprediction.net