Task 15813445

Name	hadcm3n_o0hi_1940_40_008382023_0
Workunit	8532882
Created	1 Jun 2013, 3:22:41 UTC
Sent	13 Jun 2013, 22:10:01 UTC
Report deadline	13 Sep 2013, 5:37:12 UTC
Received	25 Aug 2013, 12:37:32 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1190093
Run time	8 days 20 hours 16 min 2 sec
CPU time	8 days 9 hours 12 min 38 sec
Validate state	Invalid
Credit	4,976.64
Device peak FLOPS	2.33 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2648, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6084, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3516, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3556, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 14:54:26 (3796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 04:47:45 (3748): No heartbeat from core client for 30 sec - exiting 04:47:46 (3748): No heartbeat from core client for 30 sec - exiting 04:47:47 (3748): No heartbeat from core client for 30 sec - exiting 04:47:48 (3748): No heartbeat from core client for 30 sec - exiting 04:47:49 (3748): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
04:47:51 (3748): No heartbeat from core client for 30 sec - exiting 16:02:12 (3624): No heartbeat from core client for 30 sec - exiting 16:02:13 (3624): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 14:36:43 (3988): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 12:17:33 (3592): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:18:21 (4168): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:16:56 (3740): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:17:38 (1568): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 Model crashed: SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 Model crashed: SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 Model crashed: SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 Model crashed: SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 Model crashed: SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 Model crashed: SETPOS: Unit 68 to Word Address -198 Failed with Error Code -1 Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
18 Aug 2013 21:42:15	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	414,720	696,445	1.6793
14 Aug 2013 17:38:25	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	388,800	651,916	1.6767
14 Aug 2013 17:38:25	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	362,880	609,711	1.6802
14 Aug 2013 17:38:25	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	336,960	567,719	1.6848
14 Aug 2013 17:38:25	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	311,040	524,799	1.6872
14 Aug 2013 17:38:25	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	285,120	482,772	1.6932
25 Jul 2013 09:50:24	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	259,200	439,809	1.6968
23 Jul 2013 21:17:44	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	233,280	397,400	1.7035
23 Jul 2013 20:49:40	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	207,360	355,984	1.7167
26 Jun 2013 20:26:57	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	181,440	312,045	1.7198
25 Jun 2013 22:22:17	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	155,520	268,297	1.7252
22 Jun 2013 15:59:41	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	129,600	223,275	1.7228
21 Jun 2013 17:11:01	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	103,680	178,034	1.7171
18 Jun 2013 21:25:19	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	77,760	131,985	1.6973
17 Jun 2013 13:47:57	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	51,840	87,206	1.6822
15 Jun 2013 21:35:44	1190093	15813445	hadcm3n_o0hi_1940_40_008382023_0	25,920	42,878	1.6542
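
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that assumption against three rows copied from the table above; the function name `seconds_per_timestep` is hypothetical and is not part of BOINC or the CPDN software.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
trickles = [
    (25_920, 42_878, 1.6542),
    (51_840, 87_206, 1.6822),
    (414_720, 696_445, 1.6793),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in trickles:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Print the computed value next to the value reported by the server.
    print(f"{timestep:>8,}  computed={computed:.4f}  reported={reported:.4f}")
```

For these three rows the computed values agree with the reported ones to four decimal places, which is consistent with the assumed definition but does not confirm how the server actually derives the column.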