Name | hadcm3n_yjor_1900_40_007523353_4 |
Workunit | 7720828 |
Created | 25 Nov 2011, 17:24:20 UTC |
Sent | 25 Nov 2011, 17:38:29 UTC |
Report deadline | 25 Feb 2012, 1:05:40 UTC |
Received | 14 Feb 2012, 17:21:13 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 1163731 |
Run time | 29 days 2 hours 53 min 40 sec |
CPU time | 24 days 9 hours 36 min 48 sec |
Validate state | Invalid |
Credit | 11,819.52 |
Device peak FLOPS | 2.57 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.34</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3344, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3076, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=920, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3012, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3016, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2340, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2340, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2340, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 19:12:28 (3976): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:56:29 (2208): No heartbeat from core client for 30 sec - exiting 17:56:30 (2208): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
11:29:13 (3428): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2928, iMonCtr=1 Model crash detected, will try to restart... C19:15:24 (3448): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2152, iMonCtr=1 Model crash detected, will try to restart... 08:34:18 (3588): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:44:30 (3960): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:29:07 (3200): No heartbeat from core client for 30 sec - exiting 12:29:08 (3200): No heartbeat from core client for 30 sec - exiting 12:29:09 (3200): No heartbeat from core client for 30 sec - exiting 12:29:10 (3200): No heartbeat from core client for 30 sec - exiting 12:29:11 (3200): No heartbeat from core client for 30 sec - exiting 12:29:12 (3200): No heartbeat from core client for 30 sec - exiting 12:29:13 (3200): No heartbeat from core client for 30 sec - exiting 12:29:14 (3200): No heartbeat from core client for 30 sec - exiting 12:29:15 (3200): No heartbeat from core client for 30 sec - exiting 12:29:16 (3200): No heartbeat from core client for 30 sec - exiting 12:29:17 (3200): No heartbeat from core client for 30 sec - exiting 12:29:18 (3200): No heartbeat from core client for 30 sec - exiting 12:29:19 (3200): No heartbeat from core client for 30 sec - exiting 12:29:20 (3200): No heartbeat from core client for 30 sec - exiting 12:29:21 (3200): No heartbeat from core client for 30 sec - exiting 12:29:22 (3200): No heartbeat from core client for 30 sec - exiting 
12:29:23 (3200): No heartbeat from core client for 30 sec - exiting 12:29:24 (3200): No heartbeat from core client for 30 sec - exiting 12:29:25 (3200): No heartbeat from core client for 30 sec - exiting 12:29:26 (3200): No heartbeat from core client for 30 sec - exiting 12:29:27 (3200): No heartbeat from core client for 30 sec - exiting 12:29:28 (3200): No heartbeat from core client for 30 sec - exiting 12:29:29 (3200): No heartbeat from core client for 30 sec - exiting 12:29:30 (3200): No heartbeat from core client for 30 sec - exiting 12:29:31 (3200): No heartbeat from core client for 30 sec - exiting 12:29:32 (3200): No heartbeat from core client for 30 sec - exiting 12:29:33 (3200): No heartbeat from core client for 30 sec - exiting 12:29:34 (3200): No heartbeat from core client for 30 sec - exiting 12:29:35 (3200): No heartbeat from core client for 30 sec - exiting 12:29:36 (3200): No heartbeat from core client for 30 sec - exiting 12:29:37 (3200): No heartbeat from core client for 30 sec - exiting 12:29:38 (3200): No heartbeat from core client for 30 sec - exiting 12:29:39 (3200): No heartbeat from core client for 30 sec - exiting 12:29:40 (3200): No heartbeat from core client for 30 sec - exiting 12:29:41 (3200): No heartbeat from core client for 30 sec - exiting 12:29:42 (3200): No heartbeat from core client for 30 sec - exiting 12:29:44 (3200): No heartbeat from core client for 30 sec - exiting 12:29:45 (3200): No heartbeat from core client for 30 sec - exiting 12:29:46 (3200): No heartbeat from core client for 30 sec - exiting 12:29:47 (3200): No heartbeat from core client for 30 sec - exiting 12:29:48 (3200): No heartbeat from core client for 30 sec - exiting 12:29:49 (3200): No heartbeat from core client for 30 sec - exiting 12:29:50 (3200): No heartbeat from core client for 30 sec - exiting 12:29:51 (3200): No heartbeat from core client for 30 sec - exiting 12:29:52 (3200): No heartbeat from core client for 30 sec - exiting 12:29:53 (3200): No 
heartbeat from core client for 30 sec - exiting 12:29:54 (3200): No heartbeat from core client for 30 sec - exiting 12:29:55 (3200): No heartbeat from core client for 30 sec - exiting 12:29:56 (3200): No heartbeat from core client for 30 sec - exiting 12:29:57 (3200): No heartbeat from core client for 30 sec - exiting 12:29:58 (3200): No heartbeat from core client for 30 sec - exiting 12:29:59 (3200): No heartbeat from core client for 30 sec - exiting 12:30:00 (3200): No heartbeat from core client for 30 sec - exiting 12:30:01 (3200): No heartbeat from core client for 30 sec - exiting 12:30:02 (3200): No heartbeat from core client for 30 sec - exiting 12:30:03 (3200): No heartbeat from core client for 30 sec - exiting 12:30:04 (3200): No heartbeat from core client for 30 sec - exiting 12:30:05 (3200): No heartbeat from core client for 30 sec - exiting 12:30:06 (3200): No heartbeat from core client for 30 sec - exiting 12:30:07 (3200): No heartbeat from core client for 30 sec - exiting 12:30:08 (3200): No heartbeat from core client for 30 sec - exiting 12:30:09 (3200): No heartbeat from core client for 30 sec - exiting 12:30:10 (3200): No heartbeat from core client for 30 sec - exiting 12:30:11 (3200): No heartbeat from core client for 30 sec - exiting 12:30:12 (3200): No heartbeat from core client for 30 sec - exiting 12:30:13 (3200): No heartbeat from core client for 30 sec - exiting 12:30:14 (3200): No heartbeat from core client for 30 sec - exiting 12:30:15 (3200): No heartbeat from core client for 30 sec - exiting 12:30:16 (3200): No heartbeat from core client for 30 sec - exiting 12:30:17 (3200): No heartbeat from core client for 30 sec - exiting 12:30:18 (3200): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:30:19 (3200): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1076, iMonCtr=1 Model crash detected, will try to restart... Atmos Hold Restart file rename failed on atmos_restart.hold Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3652, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3300, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 12:52:54 (4496): No heartbeat from core client for 30 sec - exiting 12:52:55 (4496): No heartbeat from core client for 30 sec - exiting 12:52:56 (4496): No heartbeat from core client for 30 sec - exiting 12:52:57 (4496): No heartbeat from core client for 30 sec - exiting 12:52:59 (4496): No heartbeat from core client for 30 sec - exiting 12:53:00 (4496): No heartbeat from core client for 30 sec - exiting 12:53:01 (4496): No heartbeat from core client for 30 sec - exiting 12:53:02 (4496): No heartbeat from core client for 30 sec - exiting 12:53:03 (4496): No heartbeat from core client for 30 sec - exiting 12:53:04 (4496): No heartbeat from core client for 30 sec - exiting 12:53:05 (4496): No heartbeat from core client for 30 sec - exiting 12:53:06 (4496): No heartbeat from core client for 30 sec - exiting 12:53:07 (4496): No heartbeat from core client for 30 sec - exiting 12:53:08 (4496): No heartbeat from core client for 30 sec - exiting 12:53:09 (4496): No heartbeat from core client for 30 sec - exiting 12:53:10 (4496): No heartbeat from core client for 30 sec - exiting 12:53:11 (4496): No heartbeat from core client for 30 sec - exiting 12:53:13 (4496): No heartbeat from core client for 30 sec - exiting 12:53:14 (4496): No heartbeat from core client for 30 sec - exiting 12:53:15 (4496): No heartbeat from 
core client for 30 sec - exiting 12:53:16 (4496): No heartbeat from core client for 30 sec - exiting 12:53:17 (4496): No heartbeat from core client for 30 sec - exiting 12:53:18 (4496): No heartbeat from core client for 30 sec - exiting 12:53:19 (4496): No heartbeat from core client for 30 sec - exiting 12:53:20 (4496): No heartbeat from core client for 30 sec - exiting 12:53:21 (4496): No heartbeat from core client for 30 sec - exiting 12:53:22 (4496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1232, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1232, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 10:48:38 (3796): No heartbeat from core client for 30 sec - exiting 10:48:39 (3796): No heartbeat from core client for 30 sec - exiting 10:48:40 (3796): No heartbeat from core client for 30 sec - exiting 10:48:41 (3796): No heartbeat from core client for 30 sec - exiting 10:48:43 (3796): No heartbeat from core client for 30 sec - exiting 10:48:44 (3796): No heartbeat from core client for 30 sec - exiting 10:48:45 (3796): No heartbeat from core client for 30 sec - exiting 10:48:46 (3796): No heartbeat from core client for 30 sec - exiting 10:48:47 (3796): No heartbeat from core client for 30 sec - exiting 10:48:48 (3796): No heartbeat from core client for 30 sec - exiting 10:48:49 (3796): No heartbeat from core client for 30 sec - exiting 10:48:50 (3796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... </stderr_txt> ]]> |
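The Exit status row above shows the same code twice, as the signed value -226 and as 0xFFFFFF1E. A minimal sketch of how the two relate, assuming the hex form is simply the 32-bit two's-complement view of the signed integer (the helper names are illustrative and not part of BOINC):

```python
def to_unsigned32_hex(status: int) -> str:
    """Render a signed exit status as 32-bit two's-complement hex."""
    return f"0x{status & 0xFFFFFFFF:08X}"


def to_signed32(value: int) -> int:
    """Interpret a 32-bit unsigned value as a signed integer."""
    value &= 0xFFFFFFFF
    # Values with the top bit set represent negative numbers.
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


if __name__ == "__main__":
    print(to_unsigned32_hex(-226))
    print(to_signed32(0xFFFFFF1E))
```

Under that assumption, both representations in the row round-trip to each other.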
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
10 Feb 2012 12:52:39 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 984,960 | 2,083,533 | 2.1153 |
06 Feb 2012 17:06:11 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 959,040 | 2,028,192 | 2.1148 |
04 Feb 2012 16:28:58 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 933,120 | 1,972,397 | 2.1138 |
03 Feb 2012 11:16:36 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 907,200 | 1,916,751 | 2.1128 |
01 Feb 2012 16:08:13 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 881,280 | 1,861,482 | 2.1122 |
31 Jan 2012 14:16:23 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 855,360 | 1,805,977 | 2.1114 |
29 Jan 2012 13:07:26 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 829,440 | 1,752,701 | 2.1131 |
27 Jan 2012 14:20:54 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 803,520 | 1,696,553 | 2.1114 |
25 Jan 2012 18:24:29 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 777,600 | 1,641,563 | 2.1111 |
24 Jan 2012 15:05:08 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 751,680 | 1,586,871 | 2.1111 |
22 Jan 2012 19:11:02 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 725,760 | 1,531,943 | 2.1108 |
21 Jan 2012 14:22:01 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 699,840 | 1,477,306 | 2.1109 |
19 Jan 2012 16:18:33 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 673,920 | 1,422,695 | 2.1111 |
17 Jan 2012 20:56:10 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 648,000 | 1,368,216 | 2.1114 |
15 Jan 2012 14:16:34 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 622,080 | 1,313,493 | 2.1115 |
13 Jan 2012 20:27:22 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 596,160 | 1,259,332 | 2.1124 |
12 Jan 2012 19:59:20 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 570,240 | 1,206,967 | 2.1166 |
08 Jan 2012 20:34:42 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 544,320 | 1,152,001 | 2.1164 |
07 Jan 2012 16:33:52 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 518,400 | 1,097,925 | 2.1179 |
05 Jan 2012 17:16:47 | 1163731 | 13661669 | hadcm3n_yjor_1900_40_007523353_4 | 492,480 | 1,043,274 | 2.1184 |
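The Average (sec/TS) column appears to be cumulative CPU time divided by the timestep reached. That is an inference from the rows above, not something the page states. A small sketch under that assumption, using three rows copied from the table (the function name is illustrative):

```python
def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 places as in the table."""
    return round(cpu_seconds / timestep, 4)


# (timestep, cpu_seconds, reported average) taken from the trickle table.
rows = [
    (984_960, 2_083_533, 2.1153),
    (959_040, 2_028_192, 2.1148),
    (492_480, 1_043_274, 2.1184),
]

if __name__ == "__main__":
    for timestep, cpu_seconds, reported in rows:
        computed = seconds_per_timestep(cpu_seconds, timestep)
        print(f"{timestep:>9,}  computed={computed:.4f}  reported={reported:.4f}")
```

For these rows the computed value matches the reported one to four decimal places, which is consistent with the assumption but does not confirm how the server derives the column.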
©2024 climateprediction.net