Task 21104994

Name wah2_sas50_qa5u_201612_13_713_011503913_0
Workunit 11503913
Created 27 Feb 2018, 17:41:30 UTC
Sent 28 Feb 2018, 6:48:57 UTC
Report deadline 10 Feb 2019, 12:08:57 UTC
Received 29 Apr 2018, 10:20:23 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1456705
Run time 6 days 17 hours 35 min 38 sec
CPU time 6 days 9 hours 17 min 56 sec
Validate state Invalid
Credit 6,099.22
Device peak FLOPS 2.29 GFLOPS
Application version Weather At Home 2 (wah2) v8.24 windows_intelx86
Peak working set size 233.98 MB
Peak swap size 197.94 MB
Peak disk usage 47.26 MB
Stderr
<core_client_version>7.8.3</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=16376, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=14548, iMonCtr=2
Model crash detected, will try to restart...
GCM : BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

GCM : BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 63 - Return code = 16

GCM : BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 64 - Return code = 16

Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1504, selfPID=1504, iMonCtr=1
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=29556, iMonCtr=2
Model crash detected, will try to restart...
GCM : BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

GCM : BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 63 - Return code = 16

GCM : BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 64 - Return code = 16

Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1004, selfPID=1004, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1004, selfPID=32060, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7720, selfPID=14432, iMonCtr=1
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12252, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13348, iMonCtr=2
Model crash detected, will try to restart...
GCM : BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

GCM : BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 63 - Return code = 16

GCM : BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 64 - Return code = 16

Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10964, selfPID=10964, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10528, selfPID=17948, iMonCtr=1
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3208, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17108, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=23112, selfPID=18368, iMonCtr=1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4752, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=640, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=20772, selfPID=20772, iMonCtr=1
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=26420, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=26592, selfPID=26236, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=28536, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=26904, selfPID=7972, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12748, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=29036, iMonCtr=2
Model crash detected, will try to restart...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=30908, selfPID=32160, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=32444, selfPID=33536, iMonCtr=1
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=17300, selfPID=17300, iMonCtr=1
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=14844, selfPID=14844, iMonCtr=1
Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=14488, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=0, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8224, selfPID=14392, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_ain::Monitor...
07:49:12 (14392): called boinc_finish(0)

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>wah2_sas50_qa5u_201612_13_713_011503913_0_r1835200947_9.zip</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>wah2_sas50_qa5u_201612_13_713_011503913_0_r1835200947_10.zip</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>wah2_sas50_qa5u_201612_13_713_011503913_0_r1835200947_11.zip</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>wah2_sas50_qa5u_201612_13_713_011503913_0_r1835200947_12.zip</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>wah2_sas50_qa5u_201612_13_713_011503913_0_r1835200947_13.zip</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>wah2_sas50_qa5u_201612_13_713_011503913_0_r1835200947_restart.zip</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
</message>
]]>
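The stderr text and upload messages above can be tallied with a short script. The following is a minimal Python sketch, not part of the BOINC or CPDN tooling: the function name `summarize_stderr` and the regular expressions are assumptions, written only against the message formats visible in this log, and the sample string is a shortened excerpt copied from it.

```python
import re
from collections import Counter

# Marker line and patterns are taken from the log text above (assumed stable).
RESTART_MARKER = "Model crash detected, will try to restart..."
BUFFIN_RE = re.compile(r"BUFFIN: C I/O Error feof - Unit (\d+) - Return code = (\d+)")
XFER_RE = re.compile(r"<file_name>([^<]+)</file_name>\s*<error_code>(-?\d+)")


def summarize_stderr(text):
    """Count restart attempts, BUFFIN errors per unit, and failed uploads."""
    restarts = sum(1 for line in text.splitlines() if line.strip() == RESTART_MARKER)
    units = Counter(int(m.group(1)) for m in BUFFIN_RE.finditer(text))
    failed = [(name, int(code)) for name, code in XFER_RE.findall(text)]
    return {
        "restarts": restarts,
        "buffin_by_unit": dict(units),
        "failed_uploads": failed,
    }


# Shortened excerpt of the log above, used only to exercise the function.
sample = """\
Model crash detected, will try to restart...
GCM : BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 61 - Return code = 16
GCM : BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
<file_xfer_error>
  <file_name>wah2_sas50_qa5u_201612_13_713_011503913_0_r1835200947_9.zip</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
"""

summary = summarize_stderr(sample)
print(summary)
```

Run against the full stderr text, the same function would report one restart per "Model crash detected" line and one entry per `<file_xfer_error>` element.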
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
21 Apr 2018 21:11:13 | 1456705 | 21104994 | wah2_sas50_qa5u_201612_13_713_011503913_0 | 92,459 | 524,864 | 5.6767
21 Apr 2018 21:11:13 | 1456705 | 21104994 | wah2_sas50_qa5u_201612_13_713_011503913_0 | 80,939 | 458,346 | 5.6629
05 Mar 2018 15:06:27 | 1456705 | 21104994 | wah2_sas50_qa5u_201612_13_713_011503913_0 | 23,339 | 130,658 | 5.5983
02 Mar 2018 10:09:41 | 1456705 | 21104994 | wah2_sas50_qa5u_201612_13_713_011503913_0 | 11,819 | 65,203 | 5.5168
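
The Average (sec/TS) column in the trickle rows above appears to be CPU Time (sec) divided by Timestep. The Python sketch below checks that reading against the four rows; the function name `average_sec_per_timestep` is hypothetical, and rounding to four decimal places is an assumption based on how the values are displayed.

```python
# (timestep, cpu_seconds, reported_average) copied from the trickle rows above.
rows = [
    (92459, 524864, 5.6767),
    (80939, 458346, 5.6629),
    (23339, 130658, 5.5983),
    (11819, 65203, 5.5168),
]


def average_sec_per_timestep(timestep, cpu_seconds):
    """CPU seconds spent per model timestep, rounded as displayed on the page."""
    return round(cpu_seconds / timestep, 4)


# Largest gap between the recomputed and reported averages across all rows.
max_difference = max(
    abs(average_sec_per_timestep(ts, cpu) - reported)
    for ts, cpu, reported in rows
)

for ts, cpu, reported in rows:
    print(ts, average_sec_per_timestep(ts, cpu), reported)
```

If the reading is right, each recomputed value should agree with the reported one to within display rounding.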