Task 15548048

Name	hadcm3n_o147_2140_40_008269339_1
Workunit	8424463
Created	17 Jan 2013, 5:18:10 UTC
Sent	17 Jan 2013, 5:18:24 UTC
Report deadline	18 Apr 2013, 12:45:35 UTC
Received	6 Mar 2013, 18:27:19 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1091586
Run time	21 days 23 hours 34 min 28 sec
CPU time	20 days 13 hours 43 min 5 sec
Validate state	Invalid
Credit	7,776.00
Device peak FLOPS	1.72 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7556, iMonCtr=1 Model crash detected, will try to restart... 18:16:01 (7868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:06:45 (7788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:08:22 (2904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:57:50 (6372): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:02:46 (7172): No heartbeat from core client for 30 sec - exiting 13:02:47 (7172): No heartbeat from core client for 30 sec - exiting 13:02:48 (7172): No heartbeat from core client for 30 sec - exiting 13:02:49 (7172): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:03:26 (6512): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:45:06 (7684): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:45:43 (4592): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:15:54 (6464): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
CPDN Monitor - Quit request from BOINC... 21:15:14 (8548): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 15:59:56 (8060): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:55:45 (7616): No heartbeat from core client for 30 sec - exiting 19:55:47 (7616): No heartbeat from core client for 30 sec - exiting 19:55:48 (7616): No heartbeat from core client for 30 sec - exiting 19:55:49 (7616): No heartbeat from core client for 30 sec - exiting 19:55:50 (7616): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... MainError: 02:01:09 AM No files match the supplied pattern. MainError: 02:01:09 AM No files match the supplied pattern. MainError: 10:56:11 PM No files match the supplied pattern. MainError: 10:56:11 PM No files match the supplied pattern. MainError: 07:46:55 PM No files match the supplied pattern. MainError: 07:46:55 PM No files match the supplied pattern. MainError: 04:31:55 PM No files match the supplied pattern. MainError: 04:31:55 PM No files match the supplied pattern. 21:49:03 (8752): No heartbeat from core client for 30 sec - exiting 21:49:05 (8752): No heartbeat from core client for 30 sec - exiting 21:49:06 (8752): No heartbeat from core client for 30 sec - exiting 21:49:07 (8752): No heartbeat from core client for 30 sec - exiting 21:49:08 (8752): No heartbeat from core client for 30 sec - exiting 21:49:09 (8752): No heartbeat from core client for 30 sec - exiting 21:49:10 (8752): No heartbeat from core client for 30 sec - exiting 21:49:11 (8752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:59:29 (2820): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
21:59:30 (2820): No heartbeat from core client for 30 sec - exiting 21:59:31 (2820): No heartbeat from core client for 30 sec - exiting 21:59:32 (2820): No heartbeat from core client for 30 sec - exiting 21:59:33 (2820): No heartbeat from core client for 30 sec - exiting 21:59:34 (2820): No heartbeat from core client for 30 sec - exiting 21:59:35 (2820): No heartbeat from core client for 30 sec - exiting 21:59:36 (2820): No heartbeat from core client for 30 sec - exiting 21:59:37 (2820): No heartbeat from core client for 30 sec - exiting 21:59:39 (2820): No heartbeat from core client for 30 sec - exiting 21:59:40 (2820): No heartbeat from core client for 30 sec - exiting 23:16:10 (7100): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9028, iMonCtr=1 Model crash detected, will try to restart... 23:28:57 (7072): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:25:23 (4848): No heartbeat from core client for 30 sec - exiting 12:25:24 (4848): No heartbeat from core client for 30 sec - exiting 12:25:25 (4848): No heartbeat from core client for 30 sec - exiting 12:25:26 (4848): No heartbeat from core client for 30 sec - exiting 12:25:27 (4848): No heartbeat from core client for 30 sec - exiting 12:25:29 (4848): No heartbeat from core client for 30 sec - exiting 12:25:30 (4848): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:25:31 (4848): No heartbeat from core client for 30 sec - exiting MainError: 08:56:16 PM No files match the supplied pattern. MainError: 08:56:16 PM No files match the supplied pattern. 16:08:26 (8136): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8256, iMonCtr=1 Model crash detected, will try to restart... MainError: 07:55:01 PM No files match the supplied pattern. MainError: 07:55:01 PM No files match the supplied pattern. 03:14:55 (3128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:14:57 (3128): No heartbeat from core client for 30 sec - exiting 03:14:58 (3128): No heartbeat from core client for 30 sec - exiting 03:14:59 (3128): No heartbeat from core client for 30 sec - exiting 03:15:00 (3128): No heartbeat from core client for 30 sec - exiting forrtl: Access is denied. Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25772, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25772, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25772, iMonCtr=1 Model crash detected, will try to restart... forrtl: Access is denied. Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25772, iMonCtr=1 Model crash detected, will try to restart... </stderr_txt> ]]>
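
The Run time and CPU time fields above are given as day/hour/minute/second strings, which makes them awkward to compare directly. As a minimal sketch (the helper name `duration_to_seconds` is hypothetical and not part of BOINC or CPDN), the two values can be converted to seconds and the share of wall-clock time spent on the CPU estimated:

```python
import re

# Seconds per unit, keyed by the unit stems used on the task page.
UNIT_SECONDS = {"day": 86400, "hour": 3600, "min": 60, "sec": 1}

def duration_to_seconds(text: str) -> int:
    """Convert a string such as '21 days 23 hours 34 min 28 sec' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

# Values copied from the Run time and CPU time fields of this task.
run_time = duration_to_seconds("21 days 23 hours 34 min 28 sec")
cpu_time = duration_to_seconds("20 days 13 hours 43 min 5 sec")

# Fraction of the elapsed run time that was spent computing.
cpu_fraction = cpu_time / run_time
print(run_time, cpu_time, round(cpu_fraction, 4))
```

Under these assumptions the task spent roughly 94% of its elapsed run time on the CPU.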

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
05 Mar 2013 19:55:38	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	648,000	1,755,807	2.7096
04 Mar 2013 21:00:46	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	622,080	1,689,030	2.7151
03 Mar 2013 16:35:57	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	596,160	1,622,172	2.7210
02 Mar 2013 19:49:50	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	570,240	1,552,665	2.7228
01 Mar 2013 22:58:56	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	544,320	1,482,849	2.7242
01 Mar 2013 02:10:53	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	518,400	1,413,048	2.7258
28 Feb 2013 03:06:16	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	492,480	1,341,038	2.7230
27 Feb 2013 05:11:00	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	466,560	1,269,570	2.7211
26 Feb 2013 10:45:56	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	440,640	1,206,808	2.7388
25 Feb 2013 15:46:58	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	414,720	1,142,279	2.7543
24 Feb 2013 14:40:53	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	388,800	1,070,806	2.7541
23 Feb 2013 21:29:59	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	362,880	999,619	2.7547
23 Feb 2013 21:29:59	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	336,960	930,131	2.7604
21 Feb 2013 07:39:23	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	311,040	857,843	2.7580
20 Feb 2013 08:58:08	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	285,120	784,177	2.7503
19 Feb 2013 11:43:53	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	259,200	712,327	2.7482
18 Feb 2013 14:20:24	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	233,280	641,209	2.7487
17 Feb 2013 17:21:17	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	207,360	570,635	2.7519
16 Feb 2013 06:03:23	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	181,440	500,150	2.7566
15 Feb 2013 08:07:00	1091586	15548048	hadcm3n_o147_2140_40_008269339_1	155,520	428,546	2.7556
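
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The short sketch below checks that reading against three rows copied from the table; the function name `average_sec_per_timestep` is an illustrative assumption, not something the CPDN server exposes.

```python
# (timestep, cpu_time_sec, listed_average) copied from the trickle table.
rows = [
    (648_000, 1_755_807, 2.7096),
    (622_080, 1_689_030, 2.7151),
    (155_520, 428_546, 2.7556),
]

def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by the number of completed timesteps."""
    return cpu_time_sec / timestep

# Largest difference between the recomputed and the listed averages.
max_error = max(
    abs(average_sec_per_timestep(cpu, ts) - listed)
    for ts, cpu, listed in rows
)
print(max_error)

# Consecutive trickles in the table are 25,920 timesteps apart.
trickle_interval = rows[0][0] - rows[1][0]
print(trickle_interval)
```

For these rows the recomputed values agree with the listed averages to four decimal places, which is consistent with the column being a running mean rather than a per-trickle rate.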