climateprediction.net home page
Posts by computezrmle

1) Message boards : Number crunching : Top participants RAC (Message 70668)
Posted 24 Mar 2024 by computezrmle
Post:
Not a script (sorry) but a C++ program:
https://github.com/BOINC/boinc/blob/master/sched/update_stats.cpp

And here is how to use it:
https://boinc.berkeley.edu/trac/wiki/ProjectTasks
2) Message boards : Number crunching : Top participants RAC (Message 70666)
Posted 23 Mar 2024 by computezrmle
Post:
Just thinking that if so, then a request for a change in the server code over at git-hub might be in order?

The server package includes a script that updates the RAC of inactive users/hosts/teams, following the same method that is used when an active user/host/team reports work.
Project admins just need to run it periodically.
The suggestion is once a day.
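
To illustrate why the periodic run matters: RAC is an exponentially decaying average, so an entry that nobody touches keeps its stale value until something recomputes it. This is only a minimal sketch, not the code from the server package. The one-week half-life is an assumption, and `decay_rac` is a hypothetical helper name.

```python
import math

SECONDS_PER_DAY = 86400
# Assumption: recent average credit decays with a half-life of one week.
HALF_LIFE = 7 * SECONDS_PER_DAY


def decay_rac(rac, last_update, now, half_life=HALF_LIFE):
    """Decay a RAC value for an entry that reported no new work.

    rac          -- stored average at time last_update
    last_update  -- timestamp (seconds) of the last recalculation
    now          -- current timestamp (seconds)
    """
    elapsed = max(0.0, now - last_update)  # never decay "backwards"
    return rac * math.exp(-elapsed * math.log(2) / half_life)


# A host with RAC 1000 that has been idle for exactly one half-life
print(round(decay_rac(1000.0, 0, 7 * SECONDS_PER_DAY), 1))  # 500.0
```

Without the periodic run, such a host would still be listed with 1000 long after it stopped crunching.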
3) Message boards : Number crunching : Batch 996 Weather@Home2 East Asia25 (Message 69810)
Posted 13 Oct 2023 by computezrmle
Post:
IIRC the default TCP connection timeout on Windows is 120 s.
Long enough to establish a connection here.

The WAN connections to upload7.cpdn.org look fine.
Some basic tests from Germany:
traceroute -I upload7.cpdn.org
traceroute to upload7.cpdn.org (141.223.16.156), 30 hops max, 60 byte packets
 1  private_property (private_property)  0.282 ms  0.269 ms  0.266 ms
 2  private_property (private_property)  1.556 ms  1.895 ms  2.181 ms
 3  private_property.dip0.t-ipconnect.de (private_property)  24.707 ms  24.761 ms  24.848 ms
 4  pao-sb3-i.PAO.US.NET.DTAG.DE (62.154.5.242)  161.816 ms  161.997 ms  164.830 ms
 5  62.159.61.206 (62.159.61.206)  164.827 ms  164.979 ms  166.109 ms
 6  112.174.87.49 (112.174.87.49)  741.361 ms * *
 7  112.174.86.197 (112.174.86.197)  288.144 ms  287.256 ms  286.934 ms
 8  112.174.48.105 (112.174.48.105)  288.141 ms  277.725 ms  277.695 ms
 9  * * *
10  210.223.242.98 (210.223.242.98)  278.568 ms  278.164 ms  278.150 ms
11  141.223.180.2 (141.223.180.2)  280.832 ms  280.447 ms  280.268 ms
12  141.223.253.60 (141.223.253.60)  281.137 ms  281.841 ms  284.276 ms
13  141.223.16.93 (141.223.16.93)  281.398 ms  281.122 ms  281.101 ms
14  eawah.postech.ac.kr (141.223.16.156)  283.307 ms  281.804 ms  282.373 ms



Immediate reply from the HTTP server on the host:
time nc -zvw 5 upload7.cpdn.org 80
Connection to upload7.cpdn.org 80 port [tcp/http] succeeded!

real    0m0,285s
user    0m0,001s
sys     0m0,000s



HTTPS (port 443) on upload7.cpdn.org is closed
time nc -zvw 5 upload7.cpdn.org 443
nc: connect to upload7.cpdn.org port 443 (tcp) failed: Connection refused

real    0m0,285s
user    0m0,001s
sys     0m0,000s
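
For those without nc at hand, the same kind of check can be sketched in Python. This is a rough equivalent of `nc -zvw 5`, not a replacement for it. `port_is_open` is a hypothetical helper name, and the 5 second default just mirrors the `-w 5` used in the nc calls.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and name resolution errors
        return False
```

Calling `port_is_open("upload7.cpdn.org", 80)` and `port_is_open("upload7.cpdn.org", 443)` would correspond to the two nc calls, but the result of course depends on the state of the server at that moment.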



What I would check if not already done:
1. The TCP parameters like connection timeout, connection idle timeout, total number of allowed connections ...
1.1 on the host running the HTTP server
1.2 on all backend systems behind (1.1), e.g. DB hosts ...
1.3 on the local router(s)
2. IDS settings - if any, typically on the router(s)
4) Questions and Answers : Unix/Linux : Communicating with BOINC client. Please wait (Message 69385)
Posted 21 Jul 2023 by computezrmle
Post:
My user name was missing from the boinc group, so i added it. Immediately afterwards ...

Did you log off and back on, or reboot, after the change?
5) Questions and Answers : Windows : boinc 7.22.2 won't 'Add Project->CPDN' under Wine (or any project) (Message 69305)
Posted 13 Jul 2023 by computezrmle
Post:
... boincmgr doesn't show the 'Add Project' option (it's grayed out).

Did you attach that BOINC client via an account manager?
To get back full control you would need to detach from the account manager.
6) Message boards : Number crunching : New work discussion - 2 (Message 69076)
Posted 2 Jul 2023 by computezrmle
Post:
if not, the client waits for it until the timeout is over.


Why?

Basically (in short) because a blank line marks the end of a header block in HTTP, so the client cannot treat the "100 Continue" header as complete before it arrives.


In addition, "100-continue" was added to HTTP 1.1 after the initial spec.

Some more information can be found here including a link to the relevant RFC:
https://daniel.haxx.se/blog/2020/02/27/expect-tweaks-in-curl/

That sounds like a bug - is it a client bug or a server bug?

I would start at the server to ensure it sends a blank line.
You may notice blank lines in other parts of the logs (from google but even from the CPDN server).
7) Message boards : Number crunching : New work discussion - 2 (Message 69074)
Posted 2 Jul 2023 by computezrmle
Post:
The problem is this:
7/1/2023 3:20:31 PM | climateprediction.net | [http] [ID#5] Received header from server: HTTP/1.1 100 Continue

You may check if the server is configured to add '\r\n\r\n' (a blank line) at the end of that header.
If not, the client waits for it until the timeout is over.
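
A minimal sketch of what "waiting for the blank line" means on the receiving side. It is a simplified illustration, not the logic the BOINC client or libcurl actually uses, and `header_block_complete` is a hypothetical helper name.

```python
def header_block_complete(raw: bytes) -> bool:
    """True once the bytes received so far contain the terminating blank line."""
    return b"\r\n\r\n" in raw


# Interim response as it should arrive: status line plus blank line
complete = b"HTTP/1.1 100 Continue\r\n\r\n"
# Same response with the blank line missing
truncated = b"HTTP/1.1 100 Continue\r\n"

print(header_block_complete(complete))   # True
print(header_block_complete(truncated))  # False
```

In the second case a receiver has no way to know that the header is finished, so it keeps reading until its timeout expires.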



geophi wrote:
... the boinc executable is running in wine ...

Don't know if this modifies the network packets, e.g. removes the expected blank line.
8) Message boards : Number crunching : New work discussion - 2 (Message 69067)
Posted 1 Jul 2023 by computezrmle
Post:
May I ask whether the clients are connected via a Squid Proxy?
If so this may explain the following HTTP header:
... [http] [ID#5725] Received header from server: HTTP/1.1 100 Continue


On the client side this can be solved by adding these lines to squid.conf:
# may be a workaround for POST issues
client_request_buffer_max_size 512 MB

Then reload the configuration, e.g. with:
sudo squid -k reconfigure

On Windows open the Squid console as Administrator and run:
squid -k reconfigure


If the client is not configured to use Squid, the server's POST handling may need to be checked, especially if a size limit is set.
9) Message boards : Number crunching : New work discussion - 2 (Message 69063)
Posted 1 Jul 2023 by computezrmle
Post:
"This is not a P3P policy!"

Oh not that again.... The EU making it's mark on the world without officially declaring war.

Nonsense.

This message comes from contacting Google, which is used as a test site after the connection to the original server failed.
Google then violates a W3C standard as described here by Microsoft in 2012:
https://web.archive.org/web/20130316093939/http://blogs.msdn.com/b/ie/archive/2012/02/20/google-bypassing-user-privacy-settings.aspx
10) Questions and Answers : Windows : No tasks (Message 68959)
Posted 24 Jun 2023 by computezrmle
Post:
Checked the task list for this PC and found that it has been locked out for years. I have completely re-built it since then. Can someone please unlock it?

Why don't you create a new computer account for this computer and try to merge the old one into the new one later?


Independent from that:
Your Linux clients run BOINC v7.18.1 which is an Android only release.
Using it on Linux is known to cause lots of problems, so it must not be used there.
See:
https://boinc.berkeley.edu/forum_thread.php?id=15014&postid=112098
https://boinc.berkeley.edu/forum_thread.php?id=15014&postid=112100
11) Message boards : Number crunching : Credit handed out weekly? (Message 68827)
Posted 31 May 2023 by computezrmle
Post:
Even if a project shows updated credits (users, hosts, teams) on its own website, external stats pages usually try to download them from certain locations.
In case of CPDN this should be:
https://www.cpdn.org/stats/user.gz
https://www.cpdn.org/stats/host.gz
https://www.cpdn.org/stats/team.gz

What you get here are 3 files with this content:
bad URL

It looks like that reply is generated on the fly by a script.
If so, you may need to copy the real stats files to the correct server location and disable the script.
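
A quick way to tell a real stats export from such a placeholder is to look at the first two bytes, since every gzip file starts with the magic number 0x1f 0x8b. A minimal sketch (`looks_like_gzip` is a hypothetical helper name, and the sample payloads are made up for illustration):

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(payload: bytes) -> bool:
    """Return True if the payload starts with the gzip magic number."""
    return payload[:2] == GZIP_MAGIC


# A genuinely compressed payload versus the plain-text reply
print(looks_like_gzip(gzip.compress(b"<users/>")))  # True
print(looks_like_gzip(b"bad URL\n"))                # False
```

An external stats site that gets the plain-text reply will fail to decompress it and therefore cannot update the credits.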
12) Message boards : Number crunching : How big a task can I run? (Message 68820)
Posted 26 May 2023 by computezrmle
Post:
The free command indeed shows the sum of all swap spaces, which can be dedicated partitions and swapfiles.
On modern Linux systems the old rule to set as much (or more) swap as RAM is likely to be obsolete, except for special use cases like hibernation or huge DBs - but it might not be a good idea to run a heavy BOINC app on such a DB system.

From the performance perspective it is usually much better to invest in more RAM.

Swap usage can be tuned via "vm.swappiness=[value]" in "/etc/sysctl.conf".
The default value is usually set to 60 and you can find suggestions to increase it as well as to decrease it.
I personally prefer to set it to 1 or even 0 (avoid swapping as much as possible) on a system running BOINC and having enough RAM.
This works fine and fast as long as you regularly monitor your system status and plan your project mix.

Other users may prefer other settings.
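
To check which value a sysctl.conf-style file would set, something like this can be used. It is just a sketch: `swappiness_from_sysctl` is a hypothetical helper name, and it only looks at the text handed to it, not at drop-in directories or at the value the running kernel currently uses.

```python
def swappiness_from_sysctl(text):
    """Return the last vm.swappiness value set in sysctl.conf-style text, or None."""
    value = None
    for line in text.splitlines():
        line = line.strip()
        # skip empty lines and comments ('#' or ';')
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "vm.swappiness":
            value = int(val.strip())  # a later entry overrides an earlier one
    return value


sample = "# avoid swapping as much as possible\nvm.swappiness = 1\n"
print(swappiness_from_sysctl(sample))  # 1
```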
13) Message boards : Number crunching : Download issues (Message 68688)
Posted 21 Apr 2023 by computezrmle
Post:
BOINC's client_state.xml lists this URL as one of the download URLs:
<download_url>http://www.cpdn.org/cpdnboinc/applications//hadam4_8.09_i686-pc-linux-gnu</download_url>


Tried to download it using wget:
DEBUG output created by Wget 1.21.3 on linux-gnu.

--2023-04-21 09:15:32--  http://www.cpdn.org/cpdnboinc/applications//hadam4_8.09_i686-pc-linux-gnu
Resolving www.cpdn.org (www.cpdn.org)... 129.67.193.131
Caching www.cpdn.org => 129.67.193.131
Connecting to www.cpdn.org (www.cpdn.org)|129.67.193.131|:80... Releasing 0x000055b2e98266a0 (new refcount 0).
Deleting unused 0x000055b2e98266a0.
connected.
Created socket 4.
Releasing 0x000055b2e9820820 (new refcount 1).

---request begin---
GET /cpdnboinc/applications//hadam4_8.09_i686-pc-linux-gnu HTTP/1.1
Host: www.cpdn.org
User-Agent: Wget/1.21.3
Accept: */*
Accept-Encoding: identity
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.1 301 Moved Permanently
Date: Fri, 21 Apr 2023 07:15:32 GMT
Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.2.34
Location: https://www.cpdn.org/cpdnboinc/applications/hadam4_8.09_i686-pc-linux-gnu
Content-Length: 281
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

---response end---
301 Moved Permanently
Registered socket 4 for persistent reuse.
URI content encoding = ‘iso-8859-1’
Location: https://www.cpdn.org/cpdnboinc/applications/hadam4_8.09_i686-pc-linux-gnu [following]
Skipping 281 bytes of body: [<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="https://www.cpdn.org/cpdnboinc/applications/hadam4_8.09_i686-pc-linux-gnu">here</a>.</p>
</body></html>
] done.
URI content encoding = None
--2023-04-21 09:15:32--  https://www.cpdn.org/cpdnboinc/applications/hadam4_8.09_i686-pc-linux-gnu
Found www.cpdn.org in host_name_addresses_map (0x55b2e9820820)
Connecting to www.cpdn.org (www.cpdn.org)|129.67.193.131|:443... connected.
Created socket 5.
Releasing 0x000055b2e9820820 (new refcount 1).
Initiating SSL handshake.
Handshake successful; connected socket 5 to SSL handle 0x000055b2e9964460
certificate:
  subject: CN=www.cpdn.org
  issuer:  CN=R3,O=Let's Encrypt,C=US
X509 certificate successfully verified and matches host www.cpdn.org

---request begin---
GET /cpdnboinc/applications/hadam4_8.09_i686-pc-linux-gnu HTTP/1.1
Host: www.cpdn.org
User-Agent: Wget/1.21.3
Accept: */*
Accept-Encoding: identity
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.1 404 Not Found
Date: Fri, 21 Apr 2023 07:15:32 GMT
Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.2.34
Content-Length: 250
Connection: close
Content-Type: text/html; charset=iso-8859-1

---response end---
404 Not Found
URI content encoding = ‘iso-8859-1’
Closed 5/SSL 0x000055b2e9964460
2023-04-21 09:15:32 ERROR 404: Not Found.






BTW, the same happens to oifs:
--2023-04-21 09:27:40--  https://www.cpdn.org/applications//oifs_43r3_bl_app_1.11_x86_64-pc-linux-gnu.zip
Resolving www.cpdn.org (www.cpdn.org)... 129.67.193.131
Connecting to www.cpdn.org (www.cpdn.org)|129.67.193.131|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2023-04-21 09:27:40 ERROR 404: Not Found.
14) Message boards : Number crunching : Download issues (Message 68682)
Posted 20 Apr 2023 by computezrmle
Post:
Also "got" a task with download issues a couple of days ago.
These are the corresponding lines from my proxy log:
[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/ancils/VOLC38_LR_N216.gz HTTP/1.1" 404 813 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT
[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/ancils/oxi.addfa.N216L38.gz HTTP/1.1" 404 819 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT
[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/ancils/ozone_rcp45_N216L38_1999_2010v2.gz HTTP/1.1" 404 847 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT
[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/ancils/ic_N216_2002_12_000015_f.nc.gz HTTP/1.1" 404 839 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT
[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/ancils/ancil_PAMIP_tos_fut2CArctic_N216-clim.gz HTTP/1.1" 404 859 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT
[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/ancils/ancil_PAMIP_siconc_fut2CArctic_N216-clim.gz HTTP/1.1" 404 865 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT
[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/ancils/so2dms_rcp45_N216_1999_2010.gz HTTP/1.1" 404 839 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT
[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/ancils/a15j_932_atmos.gz HTTP/1.1" 404 813 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT
[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/workunits/hadam4h_a15j_200011_5_932_012142507.zip HTTP/1.1" 404 905 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT



It's definitely not a certificate issue since HTTP is used.
Instead the files are simply not where they should be (HTTP status 404).
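
If you want to pull the failed URLs out of a longer proxy log, a small filter is enough. The sketch assumes the log format shown above (request in double quotes, followed by status code and size). `failed_requests` and `LINE_RE` are hypothetical names.

```python
import re

# "<METHOD> <URL> HTTP/<version>" <status> <size>
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<url>\S+) HTTP/[\d.]+" (?P<status>\d{3}) (?P<size>\d+)'
)


def failed_requests(lines, status="404"):
    """Yield the URLs of all requests that ended with the given HTTP status."""
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("status") == status:
            yield m.group("url")


sample = [
    '[16/Apr/2023:07:15:59 +0200] "GET http://download.cpdn.org/download//batch_932/ancils/a15j_932_atmos.gz HTTP/1.1" 404 813 "-" "BOINC client (x86_64-suse-linux-gnu 7.21.0)" TCP_MISS:HIER_DIRECT',
]
print(list(failed_requests(sample)))
```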
15) Message boards : Number crunching : How is daily quota calculated? (Message 68459)
Posted 25 Feb 2023 by computezrmle
Post:
The BOINC manual and the source code answer that question:
https://boinc.berkeley.edu/trac/wiki/ProjectOptions#Joblimits

<daily_result_quota> N </daily_result_quota>
Each host has a field MRD in the interval [1 .. daily_result_quota]; it's initially daily_result_quota, and is adjusted as the host sends good or bad results. The maximum number of jobs sent to a given host in a 24-hour period is MRD*(NCPUS + GM*NGPUS). You can use this to limit the impact of faulty hosts.


https://github.com/BOINC/boinc/blob/4655fce0b47940e2dacd4269daf70b16904120fe/sched/sched_result.cpp#L50-L53
https://github.com/BOINC/boinc/blob/1e51e98e93eb68cd3f56e84e53e3270ce2cbc96c/sched/sched_send.cpp#L190-L217
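
The formula from the manual as a worked example. The numbers are made up for illustration, and GM is taken here to be the project's GPU multiplier, which is an assumption since the quoted text does not define it. `max_jobs_per_day` is a hypothetical helper name.

```python
def max_jobs_per_day(mrd, ncpus, ngpus=0, gpu_multiplier=1.0):
    """Maximum jobs per 24 h for one host: MRD * (NCPUS + GM * NGPUS)."""
    return mrd * (ncpus + gpu_multiplier * ngpus)


# Illustrative host: MRD = 4, 8 CPUs, 1 GPU, GM = 2
print(max_jobs_per_day(4, 8, ngpus=1, gpu_multiplier=2.0))  # 40.0

# Same host after MRD dropped to 1 because of bad results
print(max_jobs_per_day(1, 8, ngpus=1, gpu_multiplier=2.0))  # 10.0
```

This shows how a falling MRD throttles a faulty host while a reliable one keeps its full quota.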
16) Message boards : News : New study going out to volunteer's machines (Message 68371)
Posted 18 Feb 2023 by computezrmle
Post:
Hopefully the VBox environment has a unique IP...

There's no need for hope.
Just read the VirtualBox manual.
https://www.virtualbox.org/manual/ch06.html#networkingmodes

Hint 1: Don't mix self-created VMs with those created by vboxwrapper.
Hint 2: Your use case requires a self-created VM with Bridged networking.
17) Message boards : Number crunching : OpenIFS Discussion (Message 68364)
Posted 17 Feb 2023 by computezrmle
Post:
As far as simple RAM monitoring shows, oifs tasks seem to have a large but (most of the time) stable RAM requirement while a step is in progress.
At the beginning of each step there's a small additional peak and at the end of a step lots of RAM is released for a short time until the next step starts.

That short release might cause the BOINC client to temporarily miscalculate its RAM estimate, which could finally lead to overcommitment.

To make BOINC more stable the idea would be to check within oifs whether the maximum RAM requirement can be estimated before the task allocates it from the heap.
It might then be possible to reuse the same RAM until all steps are processed.
18) Message boards : Number crunching : The uploads are stuck (Message 68055)
Posted 26 Jan 2023 by computezrmle
Post:
Last night my oifs upload backlog could be cleared and all finished tasks could successfully be reported.
ATM all tasks in progress can upload their trickles before the next one appears.

One guy less fighting for an upload slot.
19) Message boards : Number crunching : How to Prevent OpenIFS Download (Message 67994)
Posted 23 Jan 2023 by computezrmle
Post:
You may also need to upgrade your BOINC client to at least 7.20.x since older versions suffer from a bug related to the 'max_concurrent' options.
See:
https://github.com/BOINC/boinc/pull/4592
20) Message boards : Number crunching : w/u failed at the 89th zip file (Message 67962)
Posted 22 Jan 2023 by computezrmle
Post:
double free or corruption (...)


Like many other pages this one explains what usually causes that error and what to do to avoid it:
https://linuxhint.com/double-free-corruption-error/

Looks like the code of the scientific app needs to be revised to ensure correct pointer assignment.


©2024 climateprediction.net