Task 11447739

Name famous_u51q_1599_200_006639385_3
Workunit 6842757
Created 10 Jun 2010, 11:58:05 UTC
Sent 25 Jul 2010, 14:10:34 UTC
Report deadline 24 Oct 2010, 21:37:45 UTC
Received 11 Sep 2010, 15:15:56 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1297049
Run time 6 days 12 hours 2 min 59 sec
CPU time 6 days 13 hours 53 min 41 sec
Validate state Invalid
Credit 2,810.31
Device peak FLOPS 1.69 GFLOPS
Application version UK Met Office FAMOUS v6.11
i686-pc-linux-gnu
Stderr
<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
process exited with code 22 (0x16, -234)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (19894): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=21911, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=21911, iMonCtr=1
Model crash detected, will try to restart...
 (23041): Can't acquire lockfile (-154) - waiting 35s
 (21911): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=23041, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=23041, iMonCtr=1
Model crash detected, will try to restart...
 (24492): Can't acquire lockfile (-154) - waiting 35s
 (23041): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=24492, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=24492, iMonCtr=1
Model crash detected, will try to restart...
 (25909): Can't acquire lockfile (-154) - waiting 35s
 (24492): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25909, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25909, iMonCtr=1
Model crash detected, will try to restart...
 (28699): Can't acquire lockfile (-154) - waiting 35s
 (25909): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (28699): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (8088): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9739, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9739, iMonCtr=1
Model crash detected, will try to restart...
 (11090): Can't acquire lockfile (-154) - waiting 35s
 (9739): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (11090): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13062, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13062, iMonCtr=1
Model crash detected, will try to restart...
 (13062): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17640, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17640, iMonCtr=1
Model crash detected, will try to restart...
 (25590): Can't acquire lockfile (-154) - waiting 35s
 (17640): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (25590): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (25590): No heartbeat from core client for 30 sec - exiting
 (10788): Can't acquire lockfile (-154) - waiting 35s
Suspended CPDN Monitor - Suspend request from BOINC...
 (10788): Can't acquire lockfile (-154) - exiting
 (10712): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (28018): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (7100): Can't acquire lockfile (-154) - waiting 35s
 (7100): Can't acquire lockfile (-154) - exiting
 (23780): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13885, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13885, iMonCtr=1
Model crash detected, will try to restart...
 (15432): Can't acquire lockfile (-154) - waiting 35s
 (13885): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (15432): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (2395): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (27315): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
 (27315): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
 (32277): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
 (23413): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
 (23413): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (23639): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (12777): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (17127): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (7417): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (5745): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (5805): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (5992): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (6007): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (6227): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (6252): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (6602): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (6618): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (2313): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
 (12294): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (12521): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
 (12545): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 21 - Return code = 1

Model crashed: READ_FLH: I/O error tmp/pipe_dummy

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 21 - Return code = 1

Model crashed: READ_FLH: I/O error tmp/pipe_dummy

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 21 - Return code = 1

Model crashed: READ_FLH: I/O error tmp/pipe_dummy

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 21 - Return code = 1

Model crashed: READ_FLH: I/O error tmp/pipe_dummy

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 21 - Return code = 1

Model crashed: READ_FLH: I/O error tmp/pipe_dummy

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 21 - Return code = 1

Model crashed: READ_FLH: I/O error tmp/pipe_dummy
Sorry, too many model crashes! :-(
 (12979): called boinc_finish

</stderr_txt>
]]>
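The message in the stderr output reports the exit code three ways: "22 (0x16, -234)". The first two are the same value in decimal and hexadecimal. The third matches 22 - 256, which would be consistent with the low byte being shown with the upper bits set; that reading is an assumption based on the arithmetic, not something the log states. A minimal Python sketch of the conversion, where exit_code_views is a hypothetical helper name:

```python
def exit_code_views(code: int) -> tuple[int, str, int]:
    """Return an exit status as decimal, hex, and 'low byte minus 256'.

    The third form is an assumed interpretation of the negative number
    printed in the log message; it is not taken from BOINC documentation.
    """
    low_byte = code & 0xFF          # keep only the 8-bit exit status
    return low_byte, hex(low_byte), low_byte - 256


print(exit_code_views(22))
```

For the code reported here, the three values come out as 22, 0x16 and -234, matching the message.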
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
11 Sep 2010 13:56:58 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 851,786 | 566,604 | 0.6652
11 Sep 2010 12:15:17 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 842,426 | 560,354 | 0.6652
11 Sep 2010 10:31:31 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 833,066 | 554,102 | 0.6651
11 Sep 2010 08:43:57 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 823,706 | 547,884 | 0.6651
11 Sep 2010 07:01:27 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 814,346 | 541,657 | 0.6651
11 Sep 2010 05:15:53 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 804,986 | 535,451 | 0.6652
11 Sep 2010 03:34:35 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 795,626 | 529,230 | 0.6652
11 Sep 2010 01:47:20 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 786,266 | 523,020 | 0.6652
10 Sep 2010 23:52:55 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 776,906 | 516,811 | 0.6652
10 Sep 2010 22:03:29 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 767,546 | 510,597 | 0.6652
10 Sep 2010 18:39:08 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 758,186 | 504,395 | 0.6653
10 Sep 2010 16:52:11 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 748,826 | 498,175 | 0.6653
10 Sep 2010 15:08:53 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 739,466 | 491,960 | 0.6653
10 Sep 2010 13:24:37 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 730,106 | 485,737 | 0.6653
10 Sep 2010 11:40:44 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 720,746 | 479,532 | 0.6653
10 Sep 2010 09:56:44 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 711,386 | 473,326 | 0.6654
10 Sep 2010 08:14:28 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 702,026 | 467,116 | 0.6654
10 Sep 2010 06:31:54 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 692,666 | 460,912 | 0.6654
10 Sep 2010 04:46:34 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 683,306 | 454,704 | 0.6654
10 Sep 2010 03:05:00 | 779809 | 11447739 | famous_u51q_1599_200_006639385_3 | 673,946 | 448,506 | 0.6655
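
The Average (sec/TS) column in the trickle list appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The rows above are consistent with that reading, though the page itself does not define the column. A short Python sketch that recomputes the figure for three of the rows; average_sec_per_timestep is a hypothetical helper name:

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumes the table's average is simply CPU time / timestep; the page
    does not state the formula.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds) copied from three trickle rows
rows = [(851786, 566604), (833066, 554102), (673946, 448506)]
for timestep, cpu_time in rows:
    print(timestep, cpu_time, average_sec_per_timestep(cpu_time, timestep))
```

The recomputed values for these three rows are 0.6652, 0.6651 and 0.6655, the same as the listed averages.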

