Task 15541659

Name hadcm3n_o0ed_2140_40_008282550_0
Workunit 8433685
Created 14 Jan 2013, 3:02:19 UTC
Sent 14 Jan 2013, 3:02:23 UTC
Report deadline 15 Apr 2013, 10:29:34 UTC
Received 11 Feb 2013, 22:10:25 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1255354
Run time 19 days 17 hours 38 min 17 sec
CPU time 16 days 19 hours 29 min 46 sec
Validate state Invalid
Credit 9,331.20
Device peak FLOPS 2.45 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
15:54:32 (5640): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4636, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4692, iMonCtr=1
Model crash detected, will try to restart...
16:05:29 (3008): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5304, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5236, iMonCtr=1
Model crash detected, will try to restart...
12:28:54 (5264): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:31:01 (4592): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:59:55 (5704): Can't acquire lockfile (32) - waiting 35s
21:00:16 (6364): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4584, iMonCtr=1
Model crash detected, will try to restart...
12:35:43 (4500): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5520, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4152, iMonCtr=1
Model crash detected, will try to restart...
15:33:18 (4624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4268, iMonCtr=1
Model crash detected, will try to restart...
00:41:33 (4700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
MainError:	08:44:03 AM	No files match the supplied pattern.
MainError:	08:44:03 AM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4668, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4600, iMonCtr=1
Model crash detected, will try to restart...
MainError:	02:47:52 AM	No files match the supplied pattern.
MainError:	02:47:52 AM	No files match the supplied pattern.
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5904, iMonCtr=1
Model crash detected, will try to restart...
MainError:	03:20:45 PM	No files match the supplied pattern.
MainError:	03:20:45 PM	No files match the supplied pattern.
MainError:	05:11:25 AM	No files match the supplied pattern.
MainError:	05:11:25 AM	No files match the supplied pattern.
MainError:	09:01:56 PM	No files match the supplied pattern.
MainError:	09:01:56 PM	No files match the supplied pattern.
MainError:	12:55:46 AM	No files match the supplied pattern.
MainError:	12:55:46 AM	No files match the supplied pattern.
CPDN Monitor - Quit request from BOINC...
MainError:	04:19:55 AM	No files match the supplied pattern.
MainError:	04:19:55 AM	No files match the supplied pattern.
MainError:	08:44:09 PM	No files match the supplied pattern.
MainError:	08:44:09 PM	No files match the supplied pattern.
MainError:	12:14:13 AM	No files match the supplied pattern.
MainError:	12:14:13 AM	No files match the supplied pattern.
Suspended CPDN Monitor - Suspend request from BOINC...
MainError:	04:28:50 AM	No files match the supplied pattern.
MainError:	04:28:50 AM	No files match the supplied pattern.
Error converting file to netcdf: dataout/o0edka.ph11c10
Error converting file to netcdf: dataout/o0edka.pg11c10
Error converting file to netcdf: dataout/o0edka.pe11c10
MainError:	07:58:44 PM	No files match the supplied pattern.
MainError:	07:58:44 PM	No files match the supplied pattern.
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16

Model crashed: STWORK  : I/O error - PP fixed length header                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
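The stderr output above mixes several kinds of message: lost heartbeats, restart attempts, BUFFIN I/O errors, and the final "Model crashed" reports. A minimal sketch of how such a log could be tallied by message type follows. The labels, the `tally` helper, and the abbreviated sample text are illustrative assumptions and not part of BOINC or the CPDN application; the sample reuses a few lines from the log, with the long "Model crashed" line shortened.

```python
from collections import Counter

# A few lines copied from the stderr above (the "Model crashed" line is abbreviated).
SAMPLE = """\
15:54:32 (5640): No heartbeat from core client for 30 sec - exiting
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
Model crashed: STWORK  : I/O error - PP fixed length header
Sorry, too many model crashes! :-(
"""

# Hypothetical labels mapped to substrings that identify each message type.
PATTERNS = {
    "no_heartbeat": "No heartbeat from core client",
    "restart_attempt": "Model crash detected",
    "buffin_io_error": "BUFFIN: C I/O Error",
    "model_crashed": "Model crashed:",
}


def tally(text: str) -> Counter:
    """Count how many lines contain each known message substring."""
    counts = Counter()
    for line in text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


if __name__ == "__main__":
    for label, n in sorted(tally(SAMPLE).items()):
        print(f"{label}: {n}")
```

Run against the full stderr text rather than the sample, the same function would give a per-type count that makes it easier to see which failure dominated before the task gave up.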
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
11 Feb 2013 20:09:13  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   777,600       1,455,213  1.8714
11 Feb 2013 05:15:25  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   751,680       1,405,175  1.8694
10 Feb 2013 13:05:10  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   725,760       1,355,326  1.8675
09 Feb 2013 20:47:27  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   699,840       1,305,691  1.8657
09 Feb 2013 04:20:47  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   673,920       1,256,376  1.8643
08 Feb 2013 19:50:48  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   648,000       1,208,456  1.8649
07 Feb 2013 21:06:09  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   622,080       1,159,630  1.8641
07 Feb 2013 05:11:47  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   596,160       1,111,045  1.8637
06 Feb 2013 15:30:48  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   570,240       1,065,364  1.8683
04 Feb 2013 17:16:18  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   544,320       1,017,309  1.8690
03 Feb 2013 08:47:49  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   518,400         967,768  1.8668
01 Feb 2013 22:33:08  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   492,480         918,226  1.8645
01 Feb 2013 06:51:27  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   466,560         868,779  1.8621
30 Jan 2013 21:33:18  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   440,640         819,400  1.8596
29 Jan 2013 14:25:23  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   414,720         772,057  1.8616
29 Jan 2013 14:25:23  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   388,800         725,862  1.8669
29 Jan 2013 14:25:23  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   362,880         678,789  1.8706
27 Jan 2013 15:47:23  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   336,960         630,651  1.8716
26 Jan 2013 22:48:49  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   311,040         581,049  1.8681
26 Jan 2013 06:44:34  1255354  15541659   hadcm3n_o0ed_2140_40_008282550_0   285,120         532,747  1.8685
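
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The short sketch below checks that assumption against the newest and oldest rows; the function name `avg_sec_per_timestep` is hypothetical, and the relationship is inferred from the figures rather than documented on the page.

```python
def avg_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) taken from the table above.
ROWS = [
    (777_600, 1_455_213, 1.8714),  # 11 Feb 2013 20:09:13
    (285_120, 532_747, 1.8685),    # 26 Jan 2013 06:44:34
]

if __name__ == "__main__":
    for timestep, cpu, reported in ROWS:
        computed = avg_sec_per_timestep(cpu, timestep)
        print(f"timestep={timestep} computed={computed:.4f} reported={reported:.4f}")
```

If the assumption holds for the remaining rows as well, the column is redundant with the two before it and serves mainly as a quick indicator of how steadily the host was processing the model.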

