Posts by Ingleside


1) Message boards : Number crunching : WaH batches 996 & 1001 have been closed (Message 70604)
Posted 5 Mar 2024 by Ingleside
Post:
WaH batches 996 & 1001 have been closed
What does this mean in practice?
Does it mean that continuing to crunch is just a waste of electricity, since anything returned is simply dumped?
Or is it still useful to continue crunching these until they either finish or crap out on the next re-boot?
2) Message boards : Number crunching : New work discussion - 2 (Message 69853)
Posted 14 Oct 2023 by Ingleside
Post:
For whatever reason (possibly a bad Boinc design), Windows does not allow CPDN to shut down completely.
The problem here is that even after exiting BOINC beforehand, some of the models still crapped out on re-start.
3) Message boards : Number crunching : Batch 996 Weather@Home2 East Asia25 (Message 69826)
Posted 13 Oct 2023 by Ingleside
Post:
I have increased this in the past using a text editor.

With the high crap-out rate of these models when BOINC is exited, you can't really increase the disk limit of already-running tasks, since editing requires exiting BOINC first.

Suspending the problematic tasks (while keeping them in memory) before they hit the disk limit due to stuck uploads should, at least in theory, work.
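For what it's worth, the per-task disk limit lives in client_state.xml in the BOINC data directory, and BOINC has to be fully exited before editing it. A sketch of the relevant element, with a made-up task name and an illustrative ~20 GB value:

```xml
<!-- client_state.xml: inside the <workunit> block for the affected task -->
<workunit>
    <name>wah2_eas25_example_task</name>               <!-- hypothetical name -->
    <rsc_disk_bound>20000000000.000000</rsc_disk_bound> <!-- limit in bytes -->
</workunit>
```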
4) Message boards : Number crunching : Batch 996 Weather@Home2 East Asia25 (Message 69768)
Posted 11 Oct 2023 by Ingleside
Post:
Since I apparently overlooked a Windows 10 update, 15 tasks crapped out after the unexpected re-boot.

14 errored-out with "Signal 11 received: Segment violation" but one of them strangely enough also had "The system cannot find the drive specified. (0xf) - exit code 15 (0xf)"

One of them had "The access code is invalid. (0xc) - exit code 12 (0xc)"

All of them had at least one trickle, meaning these aren't wu's that errored out at the initial startup.
5) Message boards : Number crunching : New work discussion - 2 (Message 69559)
Posted 2 Sep 2023 by Ingleside
Post:
Would it make any difference if they were multicore?

I would much rather run a single let's say 10 GB model for one week using 4 cores, instead of running the same on a single core for 4 weeks.
Given how bad CPDN is at handling re-boots, postponing a re-boot a few days for the 4 cores to finish is possible, but waiting maybe 3 weeks for the single-core run to finish isn't really practical.
6) Message boards : Number crunching : Credit Question Answered (Message 69338)
Posted 16 Jul 2023 by Ingleside
Post:
it's currently a dreadful technical debt that small teams like CPDN have to find resources to manage, but it has no benefit for the scientists using the system.
For most BOINC projects the "technical debt" is remembering to make the stats-dumps available for stats sites; "everything else" is already handled automatically when results are validated.

As for "no benefit for the scientists", I would expect it's a huge "benefit" for an individual scientist to not have to run all the models on his/her own computer(s), since chances are with no "credit" very few other users would want to waste their time running CPDN.
7) Message boards : Number crunching : New work discussion - 2 (Message 69068)
Posted 1 Jul 2023 by Ingleside
Post:
I think that's under Linux?
Since WAH2 according to apps-page is Windows exclusive I doubt a Linux computer would try to return such work.
Still, in the off-chance it's some kind of beta-wu, a quick look on geophi's log shows
6/30/2023 1:45:18 PM | climateprediction.net | [http] [ID#21] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.22.2)

Similarly, pututu's log shows
6/30/2023 11:15:43 AM | climateprediction.net | [http] [ID#5725] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.16.20)
8) Message boards : Number crunching : New work discussion - 2 (Message 69031)
Posted 28 Jun 2023 by Ingleside
Post:
Because the graph shows that we hit maximum throughput when running the same number of tasks as cores (not threads)
Well, at least to my eyes "Tasks completed/day" increases from 1 to 7 running tasks: at 6 running tasks the blue point is below the 14 grid-line, while at 7, 8 and 9 the 14 grid-line seems to be hidden behind the blue line & dots. For 10 running tasks the blue line dips a little again.
Whether it's 7, 8 or 9 tasks that gives the highest tasks completed/day I can't really tell from the graph.
The increase is much steeper going from 1 to 6 running tasks than from 6 to 7, making it easy to overlook the small increase.

Whether the small increase is significant enough to justify running more than 1 task per core I can't really say.
9) Message boards : Number crunching : Server Status page questions (Message 68602)
Posted 18 Mar 2023 by Ingleside
Post:
Looking at both CPDN's and Rosetta@home's status pages, it seems only applications with "active" users show any information about the last 100 results. Meaning, until someone, for example, returns a HADAM4 model, no information will be displayed for HADAM4.

As for "in progress", at least with Rosetta@home's 3-days-deadline this will very quickly drop to zero then available work dries out for any of the applications. With CPDN multi-months deadlines things does go much slower.

For "ghosts", many projects does re-issue any "lost" work, example if server sends work but connection craps-out before BOINC client gets the work, but at least back in the day CPDN did not use this server-option. Meaning, unless CPDN have finially started using this server option, it wouldn't be surprising if nearly all work is "ghosts".

BTW, since CPDN stopped showing new trickles and stopped giving credit for non-OpenIFS work back in November or December 2022, it's also possible that a larger-than-normal number of users have simply quit running CPDN.
10) Message boards : Number crunching : no credit awarded? (Message 68553)
Posted 3 Mar 2023 by Ingleside
Post:
Note that the role of the BBC in promoting the early, pre-BOINC, stage of CPDN's life has escaped David's notice.
While I did run the pre-BOINC CPDN client, I can't remember the BBC being mentioned back then, but then again that's roughly 20 years ago. The "special" BBC CPDN experiment that started in 2006, on the other hand, did use BOINC.

BTW, maybe my recollection is too fuzzy, but after the BBC experiment shut down, didn't these BBC credits once upon a time show up here as a separate field on individual users' pages? I just checked and didn't see such a field.
11) Message boards : Number crunching : no credit awarded? (Message 68549)
Posted 3 Mar 2023 by Ingleside
Post:
The original system had two scripts - one to copy the trickles to a place where they could be seen on the website

This script, or whatever was supposed to replace it, clearly isn't working, as seen from the "No trickles!" on the website.

Based on the 11 August 2022 batch of WAH2 work, since trickles did work in August but not in December (when the original issue errored out), it doesn't look like any mis-configuration of the actual wu.
Instead, some possibilities include:
1: The trickle script can't copy to the directory, due to an accidentally write-protected directory, a physically full disk, a full quota, or accidentally lost access rights.
2: The ini-file telling the trickle script where to copy trickles was changed to point to a new directory, but neither the web pages nor the credit script was updated to the new directory.
3: The trickle script is stuck on a specific trickle and, even if re-started, gets stuck on the same "bad" trickle again.
4: The BOINC server was updated or re-configured and someone "forgot" to extract trickle information from the scheduler, or it is extracted to a "wrong" directory, not the one the trickle script expects.
5: Since apparently OpenIFS does not rely on trickles for crediting, it was incorrectly assumed trickles didn't need to be copied any longer.

Note, chances are that when the problem with trickles not showing up on the web page is fixed, the credit will also be fixed on the next credit run (unless I've overlooked the example where "recent" trickles do show on the web page but there is still no credit).
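Possibility 1 at least is trivial to check from a shell on the server; a minimal sketch, using a hypothetical /tmp stand-in since the real trickle directory path is unknown:

```shell
#!/bin/sh
# Hypothetical stand-in for the server's trickle output directory.
TRICKLE_DIR=/tmp/trickle_dir_demo
mkdir -p "$TRICKLE_DIR"

# Write access covers the write-protected / lost-access-rights cases.
if [ -w "$TRICKLE_DIR" ]; then
    echo "trickle dir is writable"
else
    echo "trickle dir is NOT writable"
fi

# Free space on the filesystem covers the physically-full case
# (a full quota would need quota(1) and isn't checked here).
df -P "$TRICKLE_DIR" | awk 'NR==2 {print "free blocks:", $4}'
```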
12) Message boards : Number crunching : no credit awarded? (Message 68542)
Posted 2 Mar 2023 by Ingleside
Post:
it's affecting the Hadley models on all hosts

Not just the Hadleys; it also affects WAH2 models running on Windows, for example https://www.cpdn.org/result.php?resultid=22250721
from December 2022. As is common, it says "No trickles!".
13) Message boards : Number crunching : Download server (Message 58800)
Posted 23 Sep 2018 by Ingleside
Post:
The count decreases when a task is assigned to you, since obviously the same task can't be issued to someone else. Also, the scheduling server doesn't know, and doesn't care, whether the download server is working or not.
14) Message boards : Number crunching : Africa v7.22 Errors (Message 51317)
Posted 26 Jan 2015 by Ingleside
Post:
Hmm, does the Africa application have the same fatal bug as the PNW application, landing in the "crashes 100 times before giving up" trap, or is it something else going on with just some bad wu's?
15) Message boards : Number crunching : Main climateprediction.net web page down (Message 51131)
Posted 5 Jan 2015 by Ingleside
Post:
"Attaching isn't a problem"
Oops
for users already attached, who have an account_* file somewhere, yes "Attaching isn't a problem"

But, for new clients, who don't have such, yes it is a problem.

Makes it hard to recruit new (and old) users when they need an "account_*" file on their system already to sign up -

Maybe some kind of magic redirection from the public sites -- I have no clue.

New users can't sign up right now -- is that correct?


Well, I can't at the moment test if it's possible to make a new account, since I'm stuck at a Linux-computer with no possibility to run BOINC...

For everyone who already has an account, manages to find these pages and also manages to log in to their own account on these web pages, it's fairly easy to make an account file if for some reason it's not possible to just attach to http://climateapps2.oerc.ox.ac.uk/cpdnboinc/ by manually typing this into BOINC Manager...

Go to http://climateapps2.oerc.ox.ac.uk/cpdnboinc/weak_auth.php and use Notepad (or similar) to make a file called account_climateprediction.net.xml in your BOINC data directory.

For the contents, at the moment you'll need to use http://climateapps2.oerc.ox.ac.uk/cpdnboinc/ for PROJECT_URL and for WEAK_ACCOUNT_KEY you can use either the weak account-key listed on the web-page or the "normal" account-key listed on the "Your account"-page.


After successfully editing and saving account_climateprediction.net.xml, just re-start BOINC.
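For illustration, the finished file could look like this; the authenticator value below is a placeholder, to be replaced with your own weak or normal account key:

```xml
<account>
    <master_url>http://climateapps2.oerc.ox.ac.uk/cpdnboinc/</master_url>
    <authenticator>0123456789abcdef0123456789abcdef</authenticator>
</account>
```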
16) Message boards : Number crunching : Main climateprediction.net web page down (Message 51118)
Posted 4 Jan 2015 by Ingleside
Post:
Has anyone else been able to attach since the front page went offline? If so how?

Attaching isn't a problem: you can use http://climateapps2.oerc.ox.ac.uk/cpdnboinc/ to attach, since they've also included the link to the scheduler on this page. While the BOINC client will complain about using this address, it will still work without any problems. I've for many years been attached to the "wrong" url for a project without having any problems apart from getting the "wrong address" on scheduler requests.

For anyone already attached but stuck needing to re-download the master page, a work-around is to edit the account_climateprediction.net.xml file in the BOINC data directory to use http://climateapps2.oerc.ox.ac.uk/cpdnboinc/ as the master URL (the 2nd line in the file) and afterwards re-start the BOINC client.

When the homepage is finally back, just re-edit the account file to use http://climateprediction.net/ as the master_url. This can be done even if you attached to the "wrong" url originally.
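Concretely, the edit is just to the master_url element; a sketch of the two states of that line:

```xml
<!-- while the main site is down: -->
<master_url>http://climateapps2.oerc.ox.ac.uk/cpdnboinc/</master_url>
<!-- once climateprediction.net is back: -->
<master_url>http://climateprediction.net/</master_url>
```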


BTW, while attaching isn't difficult to do, I immediately hit the Linux bug of "missing libstdc++.so.6", a library so ancient it's no longer part of the repositories, meaning I've not got the foggiest idea how to install it.
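For what it's worth, the CPDN science apps are 32-bit, so on a 64-bit system the loader can report libstdc++.so.6 as missing even when the 64-bit copy is installed; the usual fix is the distribution's 32-bit compatibility package. A hedged sketch (package names vary by distribution, and /bin/sh below is just a stand-in binary to demonstrate ldd):

```shell
#!/bin/sh
# Typical compatibility packages (names are distribution-dependent):
#   sudo apt-get install lib32stdc++6     # Debian/Ubuntu
#   sudo dnf install libstdc++.i686       # Fedora/RHEL
#
# To see which shared libraries a binary actually fails to resolve,
# run ldd against it (substitute the CPDN app binary for /bin/sh):
if ldd /bin/sh | grep -q "not found"; then
    echo "missing libraries"
else
    echo "all libraries resolved"
fi
```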
17) Message boards : Number crunching : Compute Errors on Pacific North West v7.22 Tasks (Message 49593)
Posted 18 Jul 2014 by Ingleside
Post:
Last PNW to come to my machine was on 12th Feb this year. It completed.

OK, I forgot to specify that it's all the Windows PNW tasks crapping out; under a different OS like Linux this batch is possibly worse, since this time it's an input-file error, while I'm not sure of the source of error for the "no heartbeat" tasks.
18) Message boards : Number crunching : Compute Errors on Pacific North West v7.22 Tasks (Message 49590)
Posted 18 Jul 2014 by Ingleside
Post:

That's because of the INITTIME error, as mentioned a few posts down.

All PNW models now crapping out after 30 seconds or so with an INITTIME error is a huge improvement over the previous batches...

... since those ran through 100 re-starts due to "no heartbeat" before crapping out and, as a "bonus", left behind around 300 MB of garbage on the HD.

Frankly, AFAIK PNW hasn't worked since the upgrade to 7.22, a version that AFAIK wasn't even beta-tested before release, so I've no idea why CPDN continues releasing new PNW garbage before they've even tried to get it working in beta.



19) Message boards : Number crunching : Project keeps resetting - any explanations? (Message 49371)
Posted 16 Jun 2014 by Ingleside
Post:
Anyway, the disk value has been typical for this machine for quite sometime, which is why I thought it was "normal." So -- I have one last question: Is it easiest to simply try a project "reset" to clean out the directory? I would dislike damaging the directory structure the way I go about wiping folders and disks... Thanks!

Reset should work.
20) Questions and Answers : Macintosh : GB added to my Time Machine backup (Message 48959)
Posted 29 Apr 2014 by Ingleside
Post:
2) Normally exclude the boinc directory tree from time capsule backups. Once a week, remove this exclusion, select "backup now" and after the backup is done, re-exclude the boinc directory tree for another week.

3) Same as 2) except suspend/shutdown boinc while the backup is taking place. (To satisfy my paranoia about doing a backup while boinc is running. )

4) Once a week suspend/shutdown boinc, copy the directory tree to a backup disk, restart boinc. The only problem of this option, is remembering to do it, and waiting around while 14 GB is being copied to the (relatively slow) backup disk so boinc activity can be resumed.

I tend to lean toward 4).

Well, for hadam3p_eu models it's a waste of time to do weekly backups, since chances are any restored backup will be of models you've already finished & reported. For hadam3p_anz the usefulness of a weekly backup is also limited, so except for hadcm3n a weekly backup is mostly useless. (No idea about Moses.)

A daily backup on the other hand would be much more useful. If the "Time Machine" is up to the task, I would choose option #5:

5: Exclude boinc from hourly backup. Make a separate backup-profile for BOINC, doing a daily backup of only the BOINC data-directory (including sub-directories).

If time machine can't handle #5, option #2 but done once-a-day is probably the best.
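Option 4 is easy to script; a minimal sketch with throwaway /tmp paths standing in for the real BOINC data directory and backup disk (stopping and restarting BOINC is left out, since the commands differ per platform):

```shell
#!/bin/sh
# Throwaway demo paths; substitute your real BOINC data directory
# and a directory on the backup disk.
SRC=/tmp/boinc_demo_data
DST=/tmp/boinc_demo_backup

# Fake a tiny BOINC data directory for the demo.
mkdir -p "$SRC/projects/climateprediction.net"
echo "<client_state/>" > "$SRC/client_state.xml"

# Copy the whole tree, preserving attributes.
mkdir -p "$DST"
cp -Rp "$SRC"/. "$DST"/

ls "$DST"
```

With rsync -a --delete instead of cp, later runs would only transfer changed files, which should cut the wait for 14 GB considerably.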



©2024 climateprediction.net