Name | hadcm3n_u1we_1980_40_007461009_4 |
Workunit | 7658512 |
Created | 23 Sep 2011, 15:09:54 UTC |
Sent | 23 Sep 2011, 15:27:31 UTC |
Report deadline | 23 Dec 2011, 22:54:42 UTC |
Received | 16 Nov 2011, 22:17:26 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1151942 |
Run time | 10 days 16 hours 34 min 47 sec |
CPU time | 7 days 5 hours 41 min 50 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 2.23 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 i686-pc-linux-gnu |
Stderr | <core_client_version>6.10.17</core_client_version> <![CDATA[ <message> process exited with code 22 (0x16, -234) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... 03:11:16 (10498): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:11:22 (10498): No heartbeat from core client for 30 sec - exiting 03:16:23 (5800): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:21:15 (5932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:27:32 (6027): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:58:37 (796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:34:58 (11845): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:36:31 (13456): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:59:55 (13522): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:16:18 (24015): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:18:36 (25388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:47:19 (25961): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
02:47:20 (25961): No heartbeat from core client for 30 sec - exiting 02:47:21 (25961): No heartbeat from core client for 30 sec - exiting 02:47:22 (25961): No heartbeat from core client for 30 sec - exiting 02:47:23 (25961): No heartbeat from core client for 30 sec - exiting 02:47:24 (25961): No heartbeat from core client for 30 sec - exiting 02:47:25 (25961): No heartbeat from core client for 30 sec - exiting 03:03:25 (29138): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:45:32 (31320): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:45:33 (31320): No heartbeat from core client for 30 sec - exiting 03:45:34 (31320): No heartbeat from core client for 30 sec - exiting 03:45:35 (31320): No heartbeat from core client for 30 sec - exiting 03:45:36 (31320): No heartbeat from core client for 30 sec - exiting 03:45:37 (31320): No heartbeat from core client for 30 sec - exiting 03:45:38 (31320): No heartbeat from core client for 30 sec - exiting 03:45:39 (31320): No heartbeat from core client for 30 sec - exiting 03:45:40 (31320): No heartbeat from core client for 30 sec - exiting 03:45:41 (31320): No heartbeat from core client for 30 sec - exiting 03:45:42 (31320): No heartbeat from core client for 30 sec - exiting 06:05:11 (4537): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 1 Model crashed: WRITHEAD: I/O error tmp/pipe_dummy 2048 forrtl: No space left on device forrtl: severe (38): error during write, unit 6, file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/stdout_um.txt Image PC Routine Line Source hadcm3n_um_6.07_i 0848EB7D Unknown Unknown Unknown hadcm3n_um_6.07_i 0848D975 Unknown Unknown Unknown hadcm3n_um_6.07_i 0845F3CF Unknown Unknown Unknown hadcm3n_um_6.07_i 0841F90D Unknown Unknown Unknown hadcm3n_um_6.07_i 0841F257 Unknown Unknown Unknown hadcm3n_um_6.07_i 08451069 Unknown Unknown Unknown hadcm3n_um_6.07_i 0844E937 Unknown Unknown Unknown hadcm3n_um_6.07_i 0836D10D Unknown Unknown Unknown hadcm3n_um_6.07_i 082EB086 Unknown Unknown Unknown hadcm3n_um_6.07_i 0838F66D Unknown Unknown Unknown hadcm3n_um_6.07_i 0839BDF8 Unknown Unknown Unknown libc.so.6 F75F0BD6 Unknown Unknown Unknown hadcm3n_um_6.07_i 0804CB11 Unknown Unknown Unknown Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=21818, iMonCtr=1 Model crash detected, will try to restart... 
BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 1 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 forrtl: No space left on device forrtl: severe (38): error during write, unit 6, file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/stdout_um.txt Image PC Routine Line Source hadcm3n_um_6.07_i 0848EB7D Unknown Unknown Unknown hadcm3n_um_6.07_i 0848D975 Unknown Unknown Unknown hadcm3n_um_6.07_i 0845F3CF Unknown Unknown Unknown hadcm3n_um_6.07_i 0841F90D Unknown Unknown Unknown hadcm3n_um_6.07_i 0841F257 Unknown Unknown Unknown hadcm3n_um_6.07_i 08455448 Unknown Unknown Unknown hadcm3n_um_6.07_i 0829D502 Unknown Unknown Unknown hadcm3n_um_6.07_i 08340D81 Unknown Unknown Unknown hadcm3n_um_6.07_i 08194301 Unknown Unknown Unknown hadcm3n_um_6.07_i 08391957 Unknown Unknown Unknown hadcm3n_um_6.07_i 0838F8B7 Unknown Unknown Unknown hadcm3n_um_6.07_i 0839BDF8 Unknown Unknown Unknown libc.so.6 F75A6BD6 Unknown Unknown Unknown hadcm3n_um_6.07_i 0804CB11 Unknown Unknown Unknown Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=21818, iMonCtr=1 Model crash detected, will try to restart... BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 1 Model crashed: WRITHEAD: I/O error tmp/pipe_dummy 2048 08:51:07 (21818): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 03:21:04 (20133): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:23:21 (30288): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:38:37 (30463): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
01:38:38 (30463): No heartbeat from core client for 30 sec - exiting 03:12:36 (21040): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:30:37 (8383): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:07:18 (11801): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:07:20 (11801): No heartbeat from core client for 30 sec - exiting 13:08:50 (7343): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:12:05 (8366): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:16:08 (9401): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:37:16 (9699): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:39:06 (13877): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:41:25 (14796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:54:50 (15118): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:57:56 (18142): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:01:13 (18832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:18:02 (20222): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
14:18:03 (20222): No heartbeat from core client for 30 sec - exiting 14:18:04 (20222): No heartbeat from core client for 30 sec - exiting 14:18:05 (20222): No heartbeat from core client for 30 sec - exiting 14:18:06 (20222): No heartbeat from core client for 30 sec - exiting 14:18:07 (20222): No heartbeat from core client for 30 sec - exiting 14:18:08 (20222): No heartbeat from core client for 30 sec - exiting 14:18:09 (20222): No heartbeat from core client for 30 sec - exiting 14:18:10 (20222): No heartbeat from core client for 30 sec - exiting 14:18:11 (20222): No heartbeat from core client for 30 sec - exiting 14:18:12 (20222): No heartbeat from core client for 30 sec - exiting 14:18:13 (20222): No heartbeat from core client for 30 sec - exiting 14:18:14 (20222): No heartbeat from core client for 30 sec - exiting 14:18:15 (20222): No heartbeat from core client for 30 sec - exiting 14:18:16 (20222): No heartbeat from core client for 30 sec - exiting 14:18:17 (20222): No heartbeat from core client for 30 sec - exiting 14:18:18 (20222): No heartbeat from core client for 30 sec - exiting 14:18:19 (20222): No heartbeat from core client for 30 sec - exiting 14:18:20 (20222): No heartbeat from core client for 30 sec - exiting 14:18:21 (20222): No heartbeat from core client for 30 sec - exiting 14:18:22 (20222): No heartbeat from core client for 30 sec - exiting 14:18:23 (20222): No heartbeat from core client for 30 sec - exiting 14:18:24 (20222): No heartbeat from core client for 30 sec - exiting 14:18:25 (20222): No heartbeat from core client for 30 sec - exiting 14:18:26 (20222): No heartbeat from core client for 30 sec - exiting 14:18:27 (20222): No heartbeat from core client for 30 sec - exiting 14:18:28 (20222): No heartbeat from core client for 30 sec - exiting 14:18:29 (20222): No heartbeat from core client for 30 sec - exiting 14:18:30 (20222): No heartbeat from core client for 30 sec - exiting 14:18:31 (20222): No heartbeat from core client for 30 sec - 
exiting 14:58:35 (22802): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:59:52 (31158): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:04:03 (32215): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:42:11 (32672): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:44:27 (7540): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:22:39 (8152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:24:20 (15852): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:04:35 (16965): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:21:20 (5766): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:21:22 (5766): No heartbeat from core client for 30 sec - exiting 19:27:02 (21462): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:27:03 (21462): No heartbeat from core client for 30 sec - exiting 19:27:04 (21462): No heartbeat from core client for 30 sec - exiting 20:10:44 (22547): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:10:45 (22547): No heartbeat from core client for 30 sec - exiting 20:10:46 (22547): No heartbeat from core client for 30 sec - exiting 20:11:30 (30743): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:06:43 (31042): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:06:58 (31042): No heartbeat from core client for 30 sec - exiting 03:04:06 (28345): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
03:04:08 (28345): No heartbeat from core client for 30 sec - exiting 03:04:10 (28345): No heartbeat from core client for 30 sec - exiting 03:04:11 (28345): No heartbeat from core client for 30 sec - exiting 03:04:12 (28345): No heartbeat from core client for 30 sec - exiting 03:04:13 (28345): No heartbeat from core client for 30 sec - exiting 03:20:06 (14153): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 03:35:26 (16861): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:35:27 (16861): No heartbeat from core client for 30 sec - exiting 17:55:25 (2210): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:58:34 (22883): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:24:06 (19845): No heartbeat from core client for 30 sec - exiting 18:24:07 (19845): No heartbeat from core client for 30 sec - exiting 18:24:08 (19845): No heartbeat from core client for 30 sec - exiting 18:24:09 (19845): No heartbeat from core client for 30 sec - exiting 18:24:10 (19845): No heartbeat from core client for 30 sec - exiting 18:24:11 (19845): No heartbeat from core client for 30 sec - exiting 18:24:12 (19845): No heartbeat from core client for 30 sec - exiting 18:24:13 (19845): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file 
/var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_u1we_1980_40_007461009/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
16 Nov 2011 14:26:48 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 259,200 | 625,260 | 2.4123 |
15 Nov 2011 18:07:58 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 233,280 | 564,874 | 2.4214 |
15 Nov 2011 18:07:52 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 207,360 | 504,095 | 2.4310 |
15 Nov 2011 18:07:49 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 181,440 | 443,538 | 2.4445 |
15 Nov 2011 18:07:58 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 155,520 | 381,653 | 2.4540 |
09 Nov 2011 22:49:43 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 129,600 | 316,082 | 2.4389 |
27 Sep 2011 01:07:56 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 103,680 | 260,225 | 2.5099 |
26 Sep 2011 04:47:19 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 77,760 | 196,665 | 2.5291 |
25 Sep 2011 08:50:42 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 51,840 | 130,368 | 2.5148 |
24 Sep 2011 12:54:24 | 1151942 | 13415612 | hadcm3n_u1we_1980_40_007461009_4 | 25,920 | 64,081 | 2.4723 |
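
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The page does not state this formula, so treat it as an assumption; it does reproduce the values in the rows checked here. A minimal sketch, where the function name `average_sec_per_timestep` is hypothetical and the input pairs are copied from three rows of the table:

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by cumulative timesteps, rounded to 4 places.

    Assumption: this is how the "Average (sec/TS)" column is derived;
    the source page does not document the calculation.
    """
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec) pairs copied from three rows of the trickle table
trickles = [
    (25_920, 64_081),
    (51_840, 130_368),
    (259_200, 625_260),
]

for timestep, cpu_time_sec in trickles:
    print(timestep, average_sec_per_timestep(cpu_time_sec, timestep))
```

For these three rows the computed values agree with the table (2.4723, 2.5148, and 2.4123 respectively).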