1)
Questions and Answers :
Windows :
Intel Visual Fortran run-time error
(Message 56542)
Posted 22 Jul 2017 by skgiven Post: I was getting Visual Fortran run-time errors for a hadcm3s task. System: Win 10 x64 with an i7-3770K @ 3GHz and two GTX970s. Stable system. I was only running one climate model. Other tasks include WCG, TN_Grid, PG, GCC (NCI) and GPUGrid. CPU not saturated.
forrtl: severe (24): end-of-file during read, unit 5, file C:\BOINC\projects\climateprediction.net\hadcm3s_a036_203412_120_599_011122128\jobs\climate.cpdc, line 873, position 0 Image ... hadcm3s_um_8.34_w ... Stack trace terminated abnormally.
|
2)
Message boards :
Number crunching :
Is there a problem with uploader1.atm.ox.ac.uk?
(Message 46724)
Posted 28 Jul 2013 by skgiven Post: I'm not able to upload a completed WU. From Boinc's log: 28/07/2013 16:10:28 | climateprediction.net | update requested by user 28/07/2013 16:10:34 | climateprediction.net | Sending scheduler request: Requested by user. 28/07/2013 16:10:34 | climateprediction.net | Sending trickle-up message 28/07/2013 16:10:34 | climateprediction.net | Reporting 1 completed tasks 28/07/2013 16:10:34 | climateprediction.net | Not requesting tasks: "no new tasks" requested via Manager 28/07/2013 16:10:35 | climateprediction.net | Scheduler request failed: HTTP internal server error More detailed logs (with some flags set): 28/07/2013 16:31:29 | climateprediction.net | Sending scheduler request: Requested by user. 28/07/2013 16:31:29 | climateprediction.net | Sending trickle-up message 28/07/2013 16:31:29 | climateprediction.net | Reporting 1 completed tasks 28/07/2013 16:31:29 | climateprediction.net | Not requesting tasks: "no new tasks" requested via Manager 28/07/2013 16:31:29 | climateprediction.net | [http] HTTP_OP::init_post(): http://climateapps2.oerc.ox.ac.uk/cpdnboinc_cgi/cgi 28/07/2013 16:31:29 | climateprediction.net | [http] HTTP_OP::libcurl_exec(): ca-bundle set 28/07/2013 16:31:29 | | [poll] CLIENT_STATE::poll_slow_events(): scheduler_rpc 28/07/2013 16:31:29 | | [poll] CLIENT_STATE::do_something(): End poll: 1 tasks active 28/07/2013 16:31:29 | | [suspend] net_susp 0 file_xfer_susp 0 reason 0 28/07/2013 16:31:29 | | [poll] CLIENT_STATE::do_something(): End poll: 0 tasks active 28/07/2013 16:31:29 | climateprediction.net | [http] [ID#1] Info: Connection #0 seems to be dead! 28/07/2013 16:31:29 | climateprediction.net | [http] [ID#1] Info: Closing connection #0 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Info: About to connect() to climateapps2.oerc.ox.ac.uk port 80 (#0) 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Info: Trying 129.67.194.243... 
28/07/2013 16:31:30 | | [suspend] net_susp 0 file_xfer_susp 0 reason 0 28/07/2013 16:31:30 | | [poll] CLIENT_STATE::do_something(): End poll: 0 tasks active 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Info: Connected to climateapps2.oerc.ox.ac.uk (129.67.194.243) port 80 (#0) 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Info: Connected to climateapps2.oerc.ox.ac.uk (129.67.194.243) port 80 (#0) 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Sent header to server: POST /cpdnboinc_cgi/cgi HTTP/1.1 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.2.5) 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Sent header to server: Host: climateapps2.oerc.ox.ac.uk 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Sent header to server: Accept: */* 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Sent header to server: Accept-Encoding: deflate, gzip 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Sent header to server: Content-Type: application/x-www-form-urlencoded 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Sent header to server: Content-Length: 78430 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Sent header to server: Expect: 100-continue 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Sent header to server: 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Received header from server: HTTP/1.1 100 Continue 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Received header from server: HTTP/1.1 500 Internal Server Error 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Received header from server: Date: Sun, 28 Jul 2013 15:31:31 GMT 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Received header from server: Server: Apache 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Received header from server: Content-Length: 623 
28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Received header from server: Connection: close 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Received header from server: Content-Type: text/html; charset=iso-8859-1 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Received header from server: 28/07/2013 16:31:30 | climateprediction.net | [http] [ID#1] Info: Closing connection #0 28/07/2013 16:31:31 | | [suspend] net_susp 0 file_xfer_susp 0 reason 0 28/07/2013 16:31:31 | climateprediction.net | Scheduler request failed: HTTP internal server error 28/07/2013 16:31:31 | | [poll] CLIENT_STATE::poll_slow_events(): scheduler_rpc 28/07/2013 16:31:31 | | [poll] CLIENT_STATE::do_something(): End poll: 1 tasks active 28/07/2013 16:31:31 | | [suspend] net_susp 0 file_xfer_susp 0 reason 0 28/07/2013 16:31:31 | | [poll] CLIENT_STATE::do_something(): End poll: 0 tasks active 28/07/2013 16:31:32 | | [suspend] net_susp 0 file_xfer_susp 0 reason 0 28/07/2013 16:31:32 | | [poll] CLIENT_STATE::do_something(): End poll: 0 tasks active 28/07/2013 16:31:33 | | [suspend] net_susp 0 file_xfer_susp 0 reason 0 28/07/2013 16:31:33 | | [poll] CLIENT_STATE::do_something(): End poll: 0 tasks active 28/07/2013 16:31:34 | | [suspend] net_susp 0 file_xfer_susp 0 reason 0 28/07/2013 16:31:34 | | [poll] CLIENT_STATE::do_something(): End poll: 0 tasks active 28/07/2013 16:31:35 | | [suspend] net_susp 0 file_xfer_susp 0 reason 0 28/07/2013 16:31:35 | | [poll] CLIENT_STATE::do_something(): End poll: 0 tasks active 28/07/2013 16:31:36 | | [suspend] net_susp 0 file_xfer_susp 0 reason 0 28/07/2013 16:31:36 | climateprediction.net | [prio] -0.000000 rsf 0.000998 rt 0.000000 rs 82160.834155 - Probably OT but I'm also seeing an Apache server issue when I go to the usermap, http://climateapps2.oerc.ox.ac.uk/cpdnboinc/usermap.php Internal Server Error The server encountered an internal error or misconfiguration and was unable to complete your request. 
Please contact the server administrator, cpdn-sysadmin@oerc.ox.ac.uk and inform them of the time the error occurred, and anything you might have done that may have caused the error. More information about this error may be available in the server error log. Apache Server at climateapps2.oerc.ox.ac.uk Port 80 - PS BBcode buttons not working :|| |
3)
Message boards :
Number crunching :
Hyperthreading
(Message 46240)
Posted 16 May 2013 by skgiven Post: Interesting results. It's a bit less than I would have hoped for, but I wouldn't have expected much more. The more threads you use, the less efficient the extra threads become, and this project is quite heavy. You would only do 8.6% more work running 8 models than running 4, and running 8 is only 3.4% more productive than running 6. You might actually find that running 7 climate models is about the same as 8, possibly better (as has been documented at other projects).
A few considerations:
I don't know what your clocks are at, but they drop when you run more WUs, from 3.9GHz to 3.5GHz at stock clocks, which is why I fix mine at 4.2GHz.
Memory and drive I/O contention would mean competition for resources. The system always uses some resources, even in Best Performance mode and with Aero off...
Memory speed probably affects the results here, more so with higher numbers of WUs running. You really need to compensate for being limited to dual channel by getting fast memory modules. Fortunately 2400 is fairly inexpensive these days.
What frequency are your CPU and memory modules running at, and are you using an SSD?
Of note is that climate models actually use more electricity than many projects:
i7-3770K @ 4.2GHz, 8GB 2133 RAM, SATA6 SSD drive
System usage 77W (includes 2 GPUs' idle power of 7W each)
With no GPU tasks running:
1 CPU BoincSimap WU: system usage 91W
2 CPU BoincSimap WUs: system usage 104W
1 CPU Ibercivis WU: system usage 93W
2 CPU Ibercivis WUs: system usage 106W
1 CPU Climate WU: system usage 95W
2 CPU Climate WUs: system usage 112W
This has always been the case.
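The power readings above can be turned into per-task increments with a little arithmetic. This is only a sketch: it assumes the 77W figure is the baseline with no CPU work units running, and the names `BASELINE_W`, `READINGS_W` and `extra_watts` are made up for illustration.

```python
# Wall-power readings (watts) quoted in the post above.
# Assumption: 77 W is the baseline with no CPU work units running.
BASELINE_W = 77

READINGS_W = {
    "BoincSimap": (91, 104),  # (1 task, 2 tasks)
    "Ibercivis": (93, 106),
    "Climate": (95, 112),
}


def extra_watts(readings, baseline=BASELINE_W):
    """Return watts above baseline for one and two tasks per project."""
    return {
        name: (one - baseline, two - baseline)
        for name, (one, two) in readings.items()
    }


if __name__ == "__main__":
    for name, (one, two) in extra_watts(READINGS_W).items():
        print(f"{name}: +{one} W for 1 task, +{two} W for 2 tasks")
```

On these figures, two climate models draw 35W above the assumed baseline, against 27W and 29W for the other two projects.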
|
4)
Message boards :
Number crunching :
WORTH THE TROUBLE????
(Message 46191)
Posted 11 May 2013 by skgiven Post: Latest batch looking good so far. Seems to be a good bunch to download. Still a bunch out there to download. Server says no:
Tasks ready to send 0
|
5)
Message boards :
Number crunching :
Reporting - Errors while computing -
(Message 45659)
Posted 15 Mar 2013 by skgiven Post:
hadcm3n_zl88_1960_40_008321064_0 8472199 24 Feb 2013 17:06:24 UTC 15 Mar 2013 0:48:30 UTC Completed 788,994.61 786,033.00 --- 11,819.52 UK Met Office Coupled Model Full Resolution Ocean v6.07
hadcm3n_3i54_1980_40_008320817_0 8471952 24 Feb 2013 16:05:43 UTC 15 Mar 2013 4:45:39 UTC Completed 794,894.29 597,384.40 --- 11,508.48 UK Met Office Coupled Model Full Resolution Ocean v6.07
hadcm3n_3g30_1980_40_008320815_0 8471950 24 Feb 2013 16:05:43 UTC 2 Mar 2013 13:22:18 UTC Error while computing 451,201.48 393,044.10 6,220.80 6,220.80 UK Met Office Coupled Model Full Resolution Ocean v6.07
hadcm3n_3msh_1980_40_008320807_0 8471942 24 Feb 2013 16:05:43 UTC 4 Mar 2013 0:17:33 UTC Error while computing 517,030.49 312,994.20 6,842.88 6,842.88 UK Met Office Coupled Model Full Resolution Ocean v6.07
hadcm3n_zfuw_1920_40_008320605_0 8471740 24 Feb 2013 15:04:21 UTC 4 Mar 2013 0:18:10 UTC Error while computing 484,250.75 452,475.50 6,842.88 6,842.88 UK Met Office Coupled Model Full Resolution Ocean v6.07
hadcm3n_4jjh_1940_40_008303591_1 8454726 23 Feb 2013 15:31:44 UTC 4 Mar 2013 0:18:10 UTC Error while computing 556,161.88 417,148.50 8,087.04 8,087.04 UK Met Office Coupled Model Full Resolution Ocean v6.07
I noticed a Windows pop-up error with these. Basically it asks whether you want to close the app! The two models that completed also hit this, but I exited from Boinc, then closed the error message and restarted the system, and the WUs completed, eventually.
Any chance we could get a Boinc setting to allow tasks to run continuously until they complete? Trying to run 7 or 8 models probably isn't the wisest, so I run other projects when crunching for climate, but Boinc keeps jumping from project to project, even with a low cache and switch between apps set to 999min. Sorry, too many model crashes! :-(
Boinc-wide I'm seeing a big increase in task failures, something I attribute to Windows.
So are these crashes related to the app or Windows? PS. 5.10 is only required for domain controllers; it's not needed for member servers. (DC's don't have local accounts, used by subsequent Boinc versions). |
6)
Message boards :
Number crunching :
Output file absent & Too many errors (may have bug)
(Message 44591)
Posted 26 Jul 2012 by skgiven Post: Some details from different systems: Task 14973021 Name hadam3p_eu_634j_2009_1_008071304_2 Workunit 8226418 Created 22 Jul 2012 0:43:29 UTC Sent 22 Jul 2012 0:47:15 UTC Received 22 Jul 2012 10:30:11 UTC Server state Over Outcome Client error Client state Compute error Exit status 0 (0x0) Computer ID 1212547 Report deadline 4 Jul 2013 6:07:15 UTC Run time 26,180.15 CPU time 25,922.02 Validate state Invalid Claimed credit 200.38 Granted credit 200.38 application version UK Met Office HADAM3P European Region v6.09 Stderr show hide <core_client_version>7.0.28</core_client_version> <![CDATA[ <stderr_txt> Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH tmp/xaakm.pipe_dummy 2048 Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_10.zip</file_name> <error_code>-161</error_code> 
</file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Name hadam3p_eu_2j5d_1987_1_008071308_1 Workunit 8226422 Created 20 Jul 2012 7:01:50 UTC Sent 20 Jul 2012 7:52:01 UTC Received 21 Jul 2012 8:19:10 UTC Server state Over Outcome Client error Client state Compute error Exit status 0 (0x0) Computer ID 1126062 Report deadline 2 Jul 2013 13:12:01 UTC Run time 13,805.54 CPU time 13,678.24 Validate state Invalid Claimed credit 0.00 Granted credit 0.00 application version UK Met Office HADAM3P European Region v6.09 Stderr show hide <core_client_version>6.10.58</core_client_version> <![CDATA[ <stderr_txt> Signal 15 received, exiting... Called boinc_finish Signal 15 received, exiting... Called boinc_finish Signal 15 received, exiting... Called boinc_finish SIGSEGV: segmentation violation Stack trace (14 frames): /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x836e1cf] [0xf0f87400] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x8136129] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x813c074] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x8131c87] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x813d6aa] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x8133fca] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x8078e6f] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x82d73ae] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x82f8867] 
/home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x82f14bb] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x82f97f6] /lib32/libc.so.6(__libc_start_main+0xe5)[0xf0df342d] /home/aida/BOINC/projects/climateprediction.net/hadam3p_eu_um_6.09_i686-pc-linux-gnu[0x804caf1] Exiting... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3708, selfPID=3695, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_1.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_11.zip</file_name> <error_code>-161</error_code> 
</file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2j5d_1987_1_008071308_1_13.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Name hadam3p_eu_60t3_2009_1_008071305_0 Workunit 8226419 Created 20 Jul 2012 5:56:54 UTC Sent 20 Jul 2012 6:02:06 UTC Received 22 Jul 2012 3:45:28 UTC Server state Over Outcome Client error Client state Compute error Exit status 0 (0x0) Computer ID 1192477 Report deadline 2 Jul 2013 11:22:06 UTC Run time 74,050.46 CPU time 72,651.55 Validate state Invalid Claimed credit 200.38 Granted credit 200.38 application version UK Met Office HADAM3P European Region v6.09 Stderr show hide <core_client_version>6.12.34</core_client_version> <![CDATA[ <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH tmp/xaakm.pipe_dummy 2048 Leaving CPDN_Main::Monitor... 
Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_60t3_2009_1_008071305_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Name hadam3p_eu_634j_2009_1_008071304_2 Workunit 8226418 Created 22 Jul 2012 0:43:29 UTC Sent 22 Jul 2012 0:47:15 UTC Received 22 Jul 2012 10:30:11 UTC Server state Over Outcome Client error Client state Compute error Exit status 0 (0x0) Computer ID 1212547 Report deadline 4 Jul 2013 6:07:15 UTC Run time 26,180.15 CPU time 25,922.02 Validate state Invalid Claimed credit 200.38 Granted credit 200.38 application version UK Met Office HADAM3P European Region v6.09 Stderr show 
hide <core_client_version>7.0.28</core_client_version> <![CDATA[ <stderr_txt> Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH tmp/xaakm.pipe_dummy 2048 Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Name hadam3p_eu_634j_2009_1_008071304_1 Workunit 8226418 Created 21 Jul 2012 5:03:17 UTC Sent 21 Jul 2012 5:11:11 UTC Received 22 Jul 2012 0:43:28 UTC Server state Over Outcome Client error Client state Compute error Exit status 0 (0x0) Computer ID 1221572 Report deadline 3 Jul 
2013 10:31:11 UTC Run time 54,671.36 CPU time 54,503.55 Validate state Invalid Claimed credit 200.38 Granted credit 200.38 application version UK Met Office HADAM3P European Region v6.09 Stderr show hide <core_client_version>7.0.25</core_client_version> <![CDATA[ <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH tmp/xaakm.pipe_dummy 2048 Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> 
<file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_1_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Name hadam3p_eu_6c44_2009_1_008071303_0 Workunit 8226417 Created 20 Jul 2012 5:56:29 UTC Sent 20 Jul 2012 6:01:45 UTC Received 21 Jul 2012 1:04:09 UTC Server state Over Outcome Client error Client state Compute error Exit status 0 (0x0) Computer ID 915051 Report deadline 2 Jul 2013 11:21:45 UTC Run time 47,264.36 CPU time 46,751.77 Validate state Invalid Claimed credit 200.38 Granted credit 200.38 application version UK Met Office HADAM3P European Region v6.09 Stderr show hide <core_client_version>7.0.28</core_client_version> <![CDATA[ <stderr_txt> Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH tmp/xaakm.pipe_dummy 2048 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4524, selfPID=4524, iMonCtr=2 Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_9.zip</file_name> 
<error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_6c44_2009_1_008071303_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
7)
Message boards :
Number crunching :
Output file absent & Too many errors (may have bug)
(Message 44574)
Posted 24 Jul 2012 by skgiven Post: Paolo's Hadam EU tasks on that computer are all crashing with an exit status of -2:
Outcome Client error
Client state Compute error
Exit status -2 (0xfffffffffffffffe)
I think this is an issue with the task or app and nothing to do with Windows, Boinc, manager or client or other apps.
Some of Paolo's other computers are failing due to the REPLANCA issue with Exit status 0, error_code -161 (file_xfer_error):
Exit status 0 (0x0)
Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH
Some of these don't seem to run (Error while downloading) but others do run (file_xfer_error):
14947580 8213025 19 Jul 2012 17:57:28 UTC 21 Jul 2012 3:16:32 UTC Error while computing 102,580.61 100,825.80 399.11 399.11 UK Met Office HADAM3P European Region v6.09
In this case, could the trickle result in a failure (file_xfer_error), which in turn causes the task to be killed, and could all this be linked to the server's availability/responsiveness (pages not loading)?
- More likely one of the ranges is out! |
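The hex value shown next to the exit status is simply -2 written as an unsigned 64-bit two's-complement number. A quick Python check of that reading (the helper names are made up for illustration, and the 64-bit width is inferred from the 16 hex digits shown):

```python
def as_unsigned_hex(value, bits=64):
    """Format a signed integer as unsigned two's-complement hex of the given width."""
    mask = (1 << bits) - 1
    return hex(value & mask)


def from_unsigned_hex(text, bits=64):
    """Convert an unsigned two's-complement hex string back to a signed integer."""
    raw = int(text, 16)
    if raw >= 1 << (bits - 1):
        raw -= 1 << bits
    return raw


if __name__ == "__main__":
    print(as_unsigned_hex(-2))
    print(from_unsigned_hex("0xfffffffffffffffe"))
```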
8)
Message boards :
Number crunching :
Output file absent & Too many errors (may have bug)
(Message 44568)
Posted 22 Jul 2012 by skgiven Post: From WU 8226400 to 8226430 there are 15 failed tasks, several have failed more than once, none have reported successfully. All are UK Met Office HADAM3P European Region and all were created at around the same time (20 Jul 2012 5:50:00 to 5:59:00 UTC) http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8226418 http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8226422 http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8226419 http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8226418 http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8226417 |
9)
Message boards :
Number crunching :
Output file absent & Too many errors (may have bug)
(Message 44564)
Posted 22 Jul 2012 by skgiven Post: Thanks for the confirmation. This sort of issue occurs at other projects too, usually when the researchers make a mistake when building the tasks, but it has also been caused by deprecated clients for auto-generated tasks. Might it be possible/worthwhile to add an early trickle point, or a file check routine, to reduce the loss in such situations, so that tasks would fail earlier rather than after, say, 10h? |
10)
Message boards :
Number crunching :
Output file absent & Too many errors (may have bug)
(Message 44562)
Posted 22 Jul 2012 by skgiven Post: Output file absent: 22/07/2012 10:38:50 | climateprediction.net | Computation for task hadam3p_eu_634j_2009_1_008071304_2 finished 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_2.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_3.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_4.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_5.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_6.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_7.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_8.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_9.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_10.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_11.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 22/07/2012 10:38:50 | climateprediction.net | Output file hadam3p_eu_634j_2009_1_008071304_2_12.zip for task hadam3p_eu_634j_2009_1_008071304_2 absent 14973021 8226418 1212547 22 Jul 2012 0:47:15 UTC 22 Jul 2012 10:30:11 UTC Error while computing 26,180.15 25,922.02 0.00 --- UK Met Office HADAM3P European Region v6.09 
<core_client_version>7.0.28</core_client_version> <![CDATA[ <stderr_txt> Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH tmp/xaakm.pipe_dummy 2048 Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_4.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_5.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_6.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> -161 is a File Not Found error. My System. My Task The WorkUnit Notes. The Ethernet to Internet connection was disconnected at the time. Also running POEM (GPU), RNA world and yoyo tasks. Only 4 CPU threads used (due to POEM requirements/setup). Write to disk @900sec. 
No other system or Boinc issues. |
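For anyone who wants to pull the failed files out of a result page like the one above, here is a minimal sketch in Python. It assumes the `<message>` block has been copied into a string; the names `MESSAGE`, `KNOWN_CODES` and `summarize_upload_failures` are made up for illustration and are not part of BOINC. Only the -161 code mentioned in the post is mapped to a description.

```python
import re

# Excerpt of the <message> block quoted in the post (two of the eleven entries).
MESSAGE = """
<file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_2.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>hadam3p_eu_634j_2009_1_008071304_2_3.zip</file_name> <error_code>-161</error_code> </file_xfer_error>
"""

# Only the code named in the post is listed; anything else is reported as unknown.
KNOWN_CODES = {-161: "file not found"}

# One <file_xfer_error> entry: a file name followed by a numeric error code.
PATTERN = re.compile(
    r"<file_xfer_error>\s*<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*</file_xfer_error>"
)


def summarize_upload_failures(text):
    """Return a (file_name, error_code, description) tuple per failed file."""
    rows = []
    for match in PATTERN.finditer(text):
        code = int(match.group("code"))
        rows.append((match.group("name"), code, KNOWN_CODES.get(code, "unknown")))
    return rows


failures = summarize_upload_failures(MESSAGE)
for name, code, meaning in failures:
    print(f"{name}: {code} ({meaning})")
```

A regular expression is used rather than an XML parser because the block sits inside CDATA next to free text, so it may not parse as a standalone XML document.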
11)
Message boards :
Number crunching :
Project has no tasks available
(Message 44539)
Posted 20 Jul 2012 by skgiven Post: I keep getting permanent "HTTP error" when it downloads a new file. I have now received this error twice and the project has to be aborted. Permanent HTTP error:
20/07/2012 13:20:50 | climateprediction.net | Sending scheduler request: To fetch work.
20/07/2012 13:20:50 | climateprediction.net | Requesting new tasks for CPU
20/07/2012 13:20:52 | climateprediction.net | Scheduler request completed: got 1 new tasks
20/07/2012 13:20:54 | climateprediction.net | Started download of hadcm3n_o6zx_2060_40_007999460.zip
20/07/2012 13:20:54 | climateprediction.net | Started download of ocean_o6zx_2060_40_007999460_0.gz
20/07/2012 13:20:56 | climateprediction.net | Finished download of hadcm3n_o6zx_2060_40_007999460.zip
20/07/2012 13:20:56 | climateprediction.net | Giving up on download of ocean_o6zx_2060_40_007999460_0.gz: permanent HTTP error
20/07/2012 13:20:56 | climateprediction.net | Started download of atmos_o6zx_2060_40_007999460_0.gz
20/07/2012 13:20:57 | climateprediction.net | Giving up on download of atmos_o6zx_2060_40_007999460_0.gz: permanent HTTP error |
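To see at a glance which files the client gave up on, the log above can be filtered with a few lines of Python. This is only a sketch: it assumes the "timestamp | project | message" layout shown in the post, and `LOG`, `GAVE_UP` and `failed_downloads` are illustrative names, not anything BOINC provides.

```python
import re

# A few of the log lines quoted in the post, one event per line.
LOG = """\
20/07/2012 13:20:54 | climateprediction.net | Started download of hadcm3n_o6zx_2060_40_007999460.zip
20/07/2012 13:20:56 | climateprediction.net | Finished download of hadcm3n_o6zx_2060_40_007999460.zip
20/07/2012 13:20:56 | climateprediction.net | Giving up on download of ocean_o6zx_2060_40_007999460_0.gz: permanent HTTP error
20/07/2012 13:20:57 | climateprediction.net | Giving up on download of atmos_o6zx_2060_40_007999460_0.gz: permanent HTTP error
"""

# Matches the "Giving up" message and captures the file name and the stated reason.
GAVE_UP = re.compile(r"Giving up on download of (?P<file>\S+): (?P<reason>.+)$")


def failed_downloads(log_text):
    """Return (timestamp, file_name, reason) for every abandoned download."""
    results = []
    for line in log_text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3:
            continue  # skip lines that do not have the three expected fields
        timestamp, _project, message = parts
        match = GAVE_UP.search(message)
        if match:
            results.append((timestamp, match.group("file"), match.group("reason")))
    return results


abandoned = failed_downloads(LOG)
for timestamp, file_name, reason in abandoned:
    print(timestamp, file_name, reason)
```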
12)
Message boards :
Number crunching :
Every last HADAM3P European Region ends in computation error
(Message 43695)
Posted 22 Jan 2012 by skgiven Post: Thanks Les. I aborted the task sitting at 100%. The error message disappeared. Closed and opened Boinc again and the error message did not return. Might be worth knowing. |
13)
Message boards :
Number crunching :
Every last HADAM3P European Region ends in computation error
(Message 43693)
Posted 22 Jan 2012 by skgiven Post: I had similar problems on 2 systems (Windows Server 2008 and Server 2003). On the 2003 server there was a popup error message:
Runtime Error! Program:E:\BOINC\projects... This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. [OK]
|
14)
Message boards :
Number crunching :
CPDN Credit Not Getting Out?
(Message 43467)
Posted 23 Nov 2011 by skgiven Post: On the 1/10 chance of hadcm3n tasks: I must have been very lucky then; on my last request for work (about 2 weeks ago) I only got hadcm3n tasks! I just asked for 5 on one system and all 5 were hadcm3n. One has completed, one is at 68% (running) and the others are between 20 and 40%. I am now running just 2 or 3 at a time; the returned task lost 14h (CPU time vs. run time delta). It's not a particularly well CP optimized system, yet, but that will hopefully change for many in due course... 'Are we there yet' time is around 313h for each task. i7-2600, 8GB, only using the 2nd drive for Boinc. GL |
15)
Message boards :
Number crunching :
Download error
(Message 43235)
Posted 16 Oct 2011 by skgiven Post: Both files that were previously not downloading on my system have now downloaded. Thanks, |
16)
Message boards :
Number crunching :
Download error
(Message 43224)
Posted 15 Oct 2011 by skgiven Post: Thought it might be reasonable to try <http_1_0>1</http_1_0> as that potential solution worked on the 13th, http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=7308. It didn't work, so I wanted to report what I could (time since this problem started); it might help someone fix the problem. I don't know if this is different from the OP or not, but the error messages are slightly different. I am still seeing the HTTP errors and 403 Forbidden, but at 12:53 (UK time) 4 tasks succeeded in downloading and are now running. The files that still have not downloaded are:
so2dms_N96_2001_12_2003_02f_1900rescale.gz
03_N96_pers_1959_1998_2011.gz |
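For reference, the <http_1_0> flag mentioned above lives in the client's cc_config.xml. The sketch below builds such a file's contents in Python; the `<cc_config>`/`<options>` nesting reflects the BOINC client configuration layout as I understand it, so check it against the official documentation before relying on it. `build_cc_config` is a hypothetical helper name, and the script only prints the XML rather than writing into the BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_cc_config(http_1_0=True):
    """Return cc_config.xml contents with the http_1_0 option set to 1 or 0."""
    root = ET.Element("cc_config")
    # Assumption: client options such as http_1_0 sit under <options>.
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "http_1_0").text = "1" if http_1_0 else "0"
    return ET.tostring(root, encoding="unicode")


config_xml = build_cc_config(True)
print(config_xml)
```

Going back to HTTP 1.1 is then a matter of setting the value to 0 or removing the element; the client needs to re-read its config files (or be restarted) for a change to take effect.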
17)
Message boards :
Number crunching :
Download error
(Message 43221)
Posted 15 Oct 2011 by skgiven Post: I have had download problems since yesterday evening (UK time, 12h or so): 6 tasks are still in the process of downloading. 6 files are still downloading (retried many times):
03_N96_pers_1959_1998_2011.gz
so2dms_N96_2002_12_2004_02f_1900rescale.gz
so2dms_N96_2004_12_2006_02f_1900rescale.gz
so2dms_N96_2000_12_2002_02f_1900rescale.gz
so2dms_N96_2001_12_2003_02f_1900rescale.gz
so2dms_N96_2003_12_2005_02f_1900rescale.gz
There appear to be no server problems on the server status page. I can download tasks from other projects.
15/10/2011 11:44:47 climateprediction.net Started download of so2dms_N96_2004_12_2006_02f_1900rescale.gz
15/10/2011 11:44:49 climateprediction.net Temporarily failed download of so2dms_N96_2004_12_2006_02f_1900rescale.gz: HTTP error
15/10/2011 11:44:49 climateprediction.net Backing off 1 hr 16 min 14 sec on download of so2dms_N96_2004_12_2006_02f_1900rescale.gz
Tried the cc_config.xml option, <http_1_0>1</http_1_0>, but no success:
15/10/2011 11:50:52 climateprediction.net Started download of so2dms_N96_2004_12_2006_02f_1900rescale.gz
15/10/2011 11:50:54 climateprediction.net Temporarily failed download of so2dms_N96_2004_12_2006_02f_1900rescale.gz: HTTP error
15/10/2011 11:50:54 climateprediction.net Backing off 54 min 44 sec on download of so2dms_N96_2004_12_2006_02f_1900rescale.gz
15/10/2011 11:51:02 climateprediction.net Started download of so2dms_N96_2000_12_2002_02f_1900rescale.gz
15/10/2011 11:51:03 climateprediction.net Temporarily failed download of so2dms_N96_2000_12_2002_02f_1900rescale.gz: HTTP error
15/10/2011 11:51:03 climateprediction.net Backing off 1 hr 34 min 47 sec on download of so2dms_N96_2000_12_2002_02f_1900rescale.gz
Tried a project reset, but the servers want Boinc to back off for an hour (they must be as sick of me as I am of them). Going back to HTTP 1.1 |
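The back-off times in the log above are easier to compare once converted to seconds. A small Python sketch follows; it assumes the "Backing off N hr N min N sec on download of FILE" wording shown in the post, and `BACKOFF`, `UNIT_SECONDS` and `backoff_seconds` are illustrative names only.

```python
import re

UNIT_SECONDS = {"hr": 3600, "min": 60, "sec": 1}

# Captures the whole delay phrase (e.g. "1 hr 16 min 14 sec ") and the file name.
BACKOFF = re.compile(
    r"Backing off (?P<delay>(?:\d+ (?:hr|min|sec) ?)+)on download of (?P<file>\S+)"
)


def backoff_seconds(message):
    """Return (file_name, delay_in_seconds), or None if the line is not a back-off."""
    match = BACKOFF.search(message)
    if match is None:
        return None
    total = 0
    for amount, unit in re.findall(r"(\d+) (hr|min|sec)", match.group("delay")):
        total += int(amount) * UNIT_SECONDS[unit]
    return match.group("file"), total


line = ("15/10/2011 11:44:49 climateprediction.net Backing off 1 hr 16 min 14 sec "
        "on download of so2dms_N96_2004_12_2006_02f_1900rescale.gz")
print(backoff_seconds(line))
```

For the first back-off in the post this gives 3600 + 16*60 + 14 = 4574 seconds.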
18)
Message boards :
Number crunching :
99.529%
(Message 43127)
Posted 2 Oct 2011 by skgiven Post: I've had a task sitting at 99.529% for some time; at least 12h. Is this common, or a known problem? i7-2600, HT on (Win x64), only 7 threads in use. Thanks, |
19)
Message boards :
Number crunching :
Credits accruing but no trickles sent????
(Message 41413)
Posted 1 Jan 2011 by skgiven Post: Some tasks reported back today - good to see progress. Pending credit is now only 48,493 (down from 66,441), but total credit only went up by under 1000 since 5 days ago. RAC is around 1K per day. This WU alone should get 7K. Hopefully it's just a matter of updating the credit database. |
20)
Message boards :
Number crunching :
Performance on hadcm3igeo "coupled" models
(Message 41366)
Posted 27 Dec 2010 by skgiven Post: I’m mostly running FAMOUS tasks on the i7 and I guess that will remain the case for some time, so I will keep it on XP x86 for now, and if I upgrade, in a month or two, I will just go for a 64-bit server. My en route system will also get some 64-bit Windows server. I'm guessing that, as with most tasks, x64 is slightly faster here too. For me this is all good news. Many thanks. |