Name | hadam3p_saf_26pu_1975_1_007232638_0 |
Workunit | 7430878 |
Created | 29 Apr 2011, 1:57:23 UTC |
Sent | 10 May 2011, 21:51:11 UTC |
Report deadline | 22 Apr 2012, 3:11:11 UTC |
Received | 17 May 2011, 23:07:48 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 1106381 |
Run time | 2 days 5 hours 40 min 20 sec |
CPU time | 1 day 23 hours 34 min 34 sec |
Validate state | Invalid |
Credit | 562.19 |
Device peak FLOPS | 2.13 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> 22:43:31 (836): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:43:32 (836): No heartbeat from core client for 30 sec - exiting 22:43:33 (836): No heartbeat from core client for 30 sec - exiting 22:43:34 (836): No heartbeat from core client for 30 sec - exiting 22:43:35 (836): No heartbeat from core client for 30 sec - exiting 22:43:36 (836): No heartbeat from core client for 30 sec - exiting 22:43:37 (836): No heartbeat from core client for 30 sec - exiting 22:43:38 (836): No heartbeat from core client for 30 sec - exiting 22:43:39 (836): No heartbeat from core client for 30 sec - exiting 22:43:40 (836): No heartbeat from core client for 30 sec - exiting 22:43:41 (836): No heartbeat from core client for 30 sec - exiting 22:43:42 (836): No heartbeat from core client for 30 sec - exiting 22:43:43 (836): No heartbeat from core client for 30 sec - exiting 22:43:44 (836): No heartbeat from core client for 30 sec - exiting 22:43:45 (836): No heartbeat from core client for 30 sec - exiting 22:43:46 (836): No heartbeat from core client for 30 sec - exiting 22:43:47 (836): No heartbeat from core client for 30 sec - exiting 22:43:48 (836): No heartbeat from core client for 30 sec - exiting 22:43:49 (836): No heartbeat from core client for 30 sec - exiting 22:43:50 (836): No heartbeat from core client for 30 sec - exiting 22:43:51 (836): No heartbeat from core client for 30 sec - exiting Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4408, selfPID=1588, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4408, selfPID=4408, iMonCtr=1 CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=948, selfPID=1536, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=948, selfPID=948, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4592, selfPID=2064, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4592, selfPID=4592, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4640, selfPID=3184, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4640, selfPID=4640, iMonCtr=1 17:19:26 (2152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:19:27 (2152): No heartbeat from core client for 30 sec - exiting 17:19:28 (2152): No heartbeat from core client for 30 sec - exiting 17:19:29 (2152): No heartbeat from core client for 30 sec - exiting 17:19:30 (2152): No heartbeat from core client for 30 sec - exiting 17:19:31 (2152): No heartbeat from core client for 30 sec - exiting 17:19:32 (2152): No heartbeat from core client for 30 sec - exiting 17:19:33 (2152): No heartbeat from core client for 30 sec - exiting 17:19:34 (2152): No heartbeat from core client for 30 sec - exiting 17:19:35 (2152): No heartbeat from core client for 30 sec - exiting 17:19:36 (2152): No heartbeat from core client for 30 sec - exiting 17:19:37 (2152): No heartbeat from core client for 30 sec - exiting 17:19:38 (2152): No heartbeat from core client for 30 sec - exiting 17:19:39 (2152): No heartbeat from core client for 30 sec - exiting 17:19:40 (2152): No heartbeat from core client for 30 sec - exiting 17:19:41 (2152): No heartbeat from core client for 30 sec - exiting 17:19:42 (2152): No heartbeat from core client for 30 sec - exiting 17:19:43 (2152): No 
heartbeat from core client for 30 sec - exiting 17:19:44 (2152): No heartbeat from core client for 30 sec - exiting 17:19:45 (2152): No heartbeat from core client for 30 sec - exiting 17:19:46 (2152): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2860, selfPID=2860, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2860, selfPID=4240, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4680, selfPID=3688, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4680, selfPID=4680, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4772, selfPID=4916, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4772, selfPID=4772, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2404, selfPID=4564, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2404, selfPID=2404, iMonCtr=1 CPDN Monitor - Quit request from BOINC... No Process Handle Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4460, selfPID=4492, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4460, selfPID=4460, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4736, selfPID=4724, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4736, selfPID=4736, iMonCtr=1 CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1272, selfPID=3780, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1272, selfPID=1272, iMonCtr=1 23:18:25 (4656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:18:26 (4656): No heartbeat from core client for 30 sec - exiting 23:18:27 (4656): No heartbeat from core client for 30 sec - exiting 23:18:28 (4656): No heartbeat from core client for 30 sec - exiting 23:18:29 (4656): No heartbeat from core client for 30 sec - exiting 23:18:30 (4656): No heartbeat from core client for 30 sec - exiting 23:18:31 (4656): No heartbeat from core client for 30 sec - exiting 23:18:32 (4656): No heartbeat from core client for 30 sec - exiting 23:18:33 (4656): No heartbeat from core client for 30 sec - exiting 23:18:34 (4656): No heartbeat from core client for 30 sec - exiting 23:18:35 (4656): No heartbeat from core client for 30 sec - exiting 23:18:36 (4656): No heartbeat from core client for 30 sec - exiting 23:18:37 (4656): No heartbeat from core client for 30 sec - exiting 23:18:38 (4656): No heartbeat from core client for 30 sec - exiting 23:18:39 (4656): No heartbeat from core client for 30 sec - exiting 23:18:40 (4656): No heartbeat from core client for 30 sec - exiting 23:18:41 (4656): No heartbeat from core client for 30 sec - exiting 23:18:42 (4656): No heartbeat from core client for 30 sec - exiting 23:18:43 (4656): No heartbeat from core client for 30 sec - exiting 23:18:44 (4656): No heartbeat from core client for 30 sec - exiting 23:18:45 (4656): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4604, selfPID=4604, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4604, selfPID=2908, iMonCtr=1 CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1400, selfPID=4648, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1400, selfPID=1400, iMonCtr=1 CPDN Monitor - Quit request from BOINC... No Process Handle Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4332, selfPID=4292, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4332, selfPID=4332, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1216, selfPID=4564, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1216, selfPID=1216, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1480, selfPID=4016, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1480, selfPID=1480, iMonCtr=1 01:00:04 (4136): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
01:00:05 (4136): No heartbeat from core client for 30 sec - exiting 01:00:06 (4136): No heartbeat from core client for 30 sec - exiting 01:00:07 (4136): No heartbeat from core client for 30 sec - exiting 01:00:08 (4136): No heartbeat from core client for 30 sec - exiting 01:00:10 (4136): No heartbeat from core client for 30 sec - exiting 01:00:11 (4136): No heartbeat from core client for 30 sec - exiting 01:00:12 (4136): No heartbeat from core client for 30 sec - exiting 01:00:13 (4136): No heartbeat from core client for 30 sec - exiting 01:00:14 (4136): No heartbeat from core client for 30 sec - exiting 01:00:15 (4136): No heartbeat from core client for 30 sec - exiting 01:00:16 (4136): No heartbeat from core client for 30 sec - exiting 01:00:17 (4136): No heartbeat from core client for 30 sec - exiting 01:00:18 (4136): No heartbeat from core client for 30 sec - exiting 01:00:19 (4136): No heartbeat from core client for 30 sec - exiting No Process Handle Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4804, selfPID=3696, iMonCtr=1 Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0 CPDN Monitor - Quit request from BOINC... No Process Handle Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1956, selfPID=2744, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1956, selfPID=1956, iMonCtr=1 CPDN Monitor - Quit request from BOINC... No Process Handle Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4416, selfPID=748, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4416, selfPID=4416, iMonCtr=1 CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4584, selfPID=4688, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4584, selfPID=4584, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4752, selfPID=2392, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4752, selfPID=4752, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4616, selfPID=4500, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4616, selfPID=4616, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=936, selfPID=2528, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=936, selfPID=936, iMonCtr=1 CPDN Monitor - Quit request from BOINC... No Process Handle Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3688, selfPID=3676, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3688, selfPID=3688, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=504, selfPID=3676, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=504, selfPID=504, iMonCtr=1 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1060, selfPID=332, iMonCtr=1 No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1060, selfPID=1060, iMonCtr=1 </stderr_txt> ]]> |
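The exit status above is reported both as a signed decimal (-226) and as a hexadecimal value (0xFFFFFF1E). The two agree if the code is read as a 32-bit two's-complement integer. A minimal sketch of that conversion follows; the 32-bit width is an assumption inferred from the eight hex digits, and the function names are illustrative, not part of BOINC.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit width, inferred from the 8 hex digits shown


def to_unsigned32_hex(code: int) -> str:
    """Render a signed exit code as its 32-bit two's-complement hex form."""
    return f"0x{code & MASK32:08X}"


def from_unsigned32(value: int) -> int:
    """Interpret a 32-bit unsigned value as a signed two's-complement integer."""
    value &= MASK32
    # If the sign bit is set, subtract 2**32 to recover the negative value.
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    print(to_unsigned32_hex(-226))      # hex form of the reported exit status
    print(from_unsigned32(0xFFFFFF1E))  # signed form of the reported hex value
```

Running the round trip on the values from this result should reproduce the pair shown in the Exit status row.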
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
16 May 2011 17:55:05 | 1106381 | 12842974 | hadam3p_saf_26pu_1975_1_007232638_0 | 34,656 | 141,088 | 4.0711 |
15 May 2011 07:11:12 | 1106381 | 12842974 | hadam3p_saf_26pu_1975_1_007232638_0 | 23,136 | 93,470 | 4.0400 |
13 May 2011 04:15:04 | 1106381 | 12842974 | hadam3p_saf_26pu_1975_1_007232638_0 | 11,616 | 47,286 | 4.0708 |
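
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. This is an inference from the three rows above, not a documented formula. A short sketch that recomputes it from the table values (the function and variable names are illustrative):

```python
# (timestep, cpu_time_sec) pairs copied from the trickle rows above.
TRICKLES = [
    (34656, 141088),
    (23136, 93470),
    (11616, 47286),
]


def average_sec_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed formula: cumulative CPU seconds per model timestep, 4 d.p."""
    return round(cpu_seconds / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_seconds in TRICKLES:
        avg = average_sec_per_timestep(timestep, cpu_seconds)
        print(f"{timestep:>6} timesteps, {cpu_seconds:>7} s -> {avg:.4f} sec/TS")
```

If the assumption holds, the recomputed values match the reported column (4.0711, 4.0400, 4.0708).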