Posts by geophi

1) Message boards : Number crunching : New Work Announcements (Message 62770)
Posted 16 days ago by Profile geophi
Post:
Batch 877 HADAM4H: 1680 tasks for Linux. 1536 were showing as available on the server, but that was close to an hour ago, so there will be quite a few fewer by now!

And these are now 5 model month runs, instead of the 4 month runs we've had for HADAM4H (N216) up until now. So they should take about 25% longer with 25% more credits.
2) Questions and Answers : Unix/Linux : *** Running 32bit CPDN from 64bit Linux - Discussion *** (Message 62769)
Posted 17 days ago by Profile geophi
Post:
Instructions for Red Hat Enterprise Linux 7 do not work for Red Hat Enterprise Linux 8. I am not surprised. I have not found out where to find compatibility libraries.

$ sudo yum install compat-libstdc++-33.i686 compat-libstdc++-33.x86_64 zlib.i686 libstdc++.i686
[sudo] password for jeandavid8:
Updating Subscription Management repositories.
Repository 'amdgpu-pro-local' is missing name in configuration, using id.
amdgpu-pro-local 2.8 MB/s | 2.9 kB 00:00
Red Hat Enterprise Linux 8 for x86_64 - AppStream (RPMs) 3.8 kB/s | 4.5 kB 00:01
Red Hat Enterprise Linux 8 for x86_64 - BaseOS (RPMs) 3.5 kB/s | 4.0 kB 00:01
No match for argument: compat-libstdc++-33.i686
No match for argument: compat-libstdc++-33.x86_64
Error: Unable to find a match

Try

sudo yum -y install libstdc++.i686 libnsl.i686

Also install zlib.i686 and/or zlib-devel.i686, which are needed for hadcm3s file upload creation.

If anything else is missing, perhaps some other suggestions for 32 bit libraries could be found in
https://www.ibm.com/support/pages/how-configure-red-hat-enterprise-linux-8-run-rational-clearcase#arch-x86-cc32
3) Questions and Answers : Unix/Linux : Building from source on 20.04 (Message 62699)
Posted 9 Sep 2020 by Profile geophi
Post:
Or just do what I do and download the berkeley version at https://boinc.berkeley.edu/dl/boinc_ubuntu_7.16.6_x86_64-pc-linux-gnu.sh

It says 7.16.6 on the download, but Help > About says 7.17.0. It is labelled a development version, but I've had no issues with it after using it for a few months.

I got sick of the service install of BOINC from the Ubuntu package manager; I prefer the ease of control of the Berkeley-packaged version.
4) Message boards : Number crunching : Big models (Message 62652)
Posted 9 Aug 2020 by Profile geophi
Post:
Will these larger models also use more RAM? If so can we get a hint as to the numbers we should expect?

Nope, same as the current hadam4h N216 models, about 1.4 GB per task.
5) Message boards : Number crunching : Big models (Message 62648)
Posted 8 Aug 2020 by Profile geophi
Post:
Don't get me wrong. I have a great faith in each and every one of you as a person. It's just that, as a species, I think we are dumber than a bag of hammers.


LOL. I've read this a lot recently (in various forms), and I couldn't agree more. This is especially evident in recent years with the proliferation of many so-called "news" media sources and social media.
6) Message boards : Number crunching : Big models (Message 62643)
Posted 7 Aug 2020 by Profile geophi
Post:
These are the hadam4h N216 models that have been the main batches we've been running lately on Linux. The ones we've been running on the main site have model month uploads of ~145 MB. In the newly tested version, the model month uploads will be ~195 MB, so about 35% more per upload.
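
As a quick check of the "about 35%" figure, here is a minimal Python sketch that uses the two approximate upload sizes quoted above. The function name is only illustrative.

```python
def percent_increase(old_mb: float, new_mb: float) -> float:
    """Return the relative increase from old_mb to new_mb, in percent."""
    return (new_mb - old_mb) / old_mb * 100.0


# Approximate per-model-month upload sizes taken from the post above.
increase = percent_increase(145.0, 195.0)
print(f"{increase:.1f}% larger per upload")
```

With these rounded sizes the result is roughly 34.5%, which is consistent with "about 35%".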
7) Message boards : Number crunching : Climate tasks keep running while BOINC is in paused mode (Message 62631)
Posted 31 Jul 2020 by Profile geophi
Post:
There's not a lot out there discussing 7.15.0, but Richard Haselgrove, a BOINC expert on the general BOINC forums, had this to say about odd-numbered builds and 7.15.0 last year:

The numbering policy is:

Even numbers are used for releases
Odd numbers are used for development work, not yet ready for release.

If you see v7.15.0 out in the wild, it's a private build, made for testing and possibly with modifications not intended for general use.

The next release branch is 7.16.1, and that can be selected under 'branches' in GitHub: I've made a private Windows build for testing, but I don't know of anybody else admitting to doing that.
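
The numbering policy quoted above can be sketched as a small check. This is only an illustration: it assumes the even/odd rule applies to the second component of the version string (the 15 in 7.15.0), which is what the examples in the quote suggest, and the function name is made up for this sketch.

```python
def is_development_build(version: str) -> bool:
    """Return True if the minor version number is odd.

    Assumption: the even/odd policy refers to the second component
    of the version string, e.g. the 15 in "7.15.0".
    """
    parts = version.strip().split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    minor = int(parts[1])
    return minor % 2 == 1


# The two versions mentioned in the quoted post.
for v in ("7.15.0", "7.16.1"):
    kind = "development" if is_development_build(v) else "release"
    print(v, "->", kind)
```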
8) Message boards : Number crunching : Climate tasks keep running while BOINC is in paused mode (Message 62629)
Posted 29 Jul 2020 by Profile geophi
Post:
I have never seen this behavior in the many tasks I've run. A couple questions:

    What version of boinc are you running?
    How are you pausing BOINC? Suspend in the Activity menu? Suspend in the Projects tab? Suspend in the Tasks tab?

9) Message boards : climateprediction.net Science : AFlame PROJECT (Message 62627)
Posted 25 Jul 2020 by Profile geophi
Post:
I just meant that if one were to automate the task of finding computers crashing tasks due to 32 bit library deficiency, the message in stderr would be a good way to do it.
10) Message boards : climateprediction.net Science : AFlame PROJECT (Message 62625)
Posted 24 Jul 2020 by Profile geophi
Post:
Does the answer lie in the participant database? Computers without the libraries will fail the tasks with a short CPU run time - maybe only a few seconds. Would it be possible to automatically block a computer that fails more than, say, 4 tasks within this time and message the owner to check his machine?

Perhaps even easier, stderr.txt has a pretty set error message for the lack of 32bit libraries. New tasks being returned with that error message in stderr would flag that computer. Matching on some or all of this, perhaps:

error while loading shared libraries: libstdc++.so.6: cannot open shared object file


It wouldn't get all 32bit library errors, but the vast majority of them.
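
A minimal sketch of the stderr check suggested above, in Python. It is not how the project server works; the function names, the host identifiers and the sample input are all hypothetical, and matching any library name (rather than only libstdc++.so.6) is an assumption made for the sketch.

```python
import re

# Pattern based on the message quoted above. Accepting any library name,
# not only libstdc++.so.6, is an assumption made for this sketch.
MISSING_LIB = re.compile(
    r"error while loading shared libraries: (\S+): cannot open shared object file"
)


def missing_libraries(stderr_text: str) -> list[str]:
    """Return the library names reported as missing in one stderr dump."""
    return MISSING_LIB.findall(stderr_text)


def hosts_to_flag(results: dict[str, list[str]]) -> list[str]:
    """Given {host_id: [stderr_text, ...]}, return the hosts with any match."""
    return sorted(
        host
        for host, dumps in results.items()
        if any(missing_libraries(text) for text in dumps)
    )


# Made-up sample data; the program name "./model" is hypothetical.
sample = {
    "host_a": [
        "./model: error while loading shared libraries: "
        "libstdc++.so.6: cannot open shared object file"
    ],
    "host_b": ["Model finished normally"],
}
print(hosts_to_flag(sample))
```

As noted above, a check like this would miss 32-bit library failures that produce a different message.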
11) Questions and Answers : Windows : Intel Visual Fortran run-time error (Message 62602)
Posted 1 Jul 2020 by Profile geophi
Post:
Now I am in a bit of a pickle as I have already aborted (RESET) all tasks and they are still being reported as "In Progress" on the website, with no way of aborting.

Any ideas?

Antony

If you want those tasks to be reissued, you need to "Remove" climateprediction.net on the Projects tab of boinc manager. Then reattach that PC to climateprediction.net. At that point, those tasks will show up as Abandoned on the task listing for the PC you detached, and tasks will be sent out from that work unit to someone else's PCs. Once they show up as Abandoned, you can merge the new PC with the old one listed and it will retain the task listing and credits of the previously removed PC.

Of course if you are currently running a climateprediction.net task, you should set No New Tasks and wait for it to finish before detaching and reattaching.
12) Message boards : Number crunching : New work Discussion (Message 62583)
Posted 14 Jun 2020 by Profile geophi
Post:
Is batch 871 still open? I just received 2 WUs (both _2s). They have failed on 2 other machines. They must be 3 or 4 months old. Are they still needed and wanted?

Batch 871 is still open. It was issued on May 22nd 2020.
13) Message boards : Number crunching : Uploading files fails (Message 62521)
Posted 27 May 2020 by Profile geophi
Post:
Similar to Iain, two tasks have finished on my i7 4770. However, none of the files have uploaded. Some will look to be 100% in the transfers tab for awhile, but won't complete. Sixty eight files waiting to upload.

I reported such to the appropriate people on the project.
14) Message boards : Number crunching : Credits (Message 62507)
Posted 26 May 2020 by Profile geophi
Post:
Data Export and Credits seem to be stuck again. Data export files are showing dates of 5 days ago.

db_dump.xml 2020-05-20 00:30 749
host.gz 2020-05-20 00:30 2.5M
tables.xml 2020-05-20 00:30 3.4K
team.gz 2020-05-20 00:30 773K
user.gz 2020-05-20 00:30 114K

Can you send an email to Les?

Thanks
Bill Freauff

The credits are calculated once a week on Wednesday or Thursday I believe. I think that's when the data export files are updated.
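
To tell a stuck export from one that is simply waiting for the weekly run, the file dates can be compared against a seven-day allowance. The sketch below is only an illustration: the check time (noon on the day of the post) and the function name are assumptions, and the listing is copied from the quote above.

```python
from datetime import datetime, timedelta

# Listing copied from the quoted post: name, date, time, size.
LISTING = """\
db_dump.xml 2020-05-20 00:30 749
host.gz 2020-05-20 00:30 2.5M
tables.xml 2020-05-20 00:30 3.4K
team.gz 2020-05-20 00:30 773K
user.gz 2020-05-20 00:30 114K
"""


def stale_files(listing: str, now: datetime, max_age: timedelta) -> list[str]:
    """Return the names of files whose timestamp is older than max_age."""
    stale = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, date_str, time_str = fields[0], fields[1], fields[2]
        stamp = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M")
        if now - stamp > max_age:
            stale.append(name)
    return stale


# Assumed check time: the post is dated 26 May 2020; noon is arbitrary.
now = datetime(2020, 5, 26, 12, 0)
print(stale_files(LISTING, now, timedelta(days=7)))
```

With a seven-day allowance none of the files count as stale at that point, which fits a once-a-week update.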
15) Message boards : Number crunching : Uploading files fails (Message 62476)
Posted 22 May 2020 by Profile geophi
Post:
And the error message is?

Trying to upload zips to upload4...


5/22/2020 3:34:53 PM | climateprediction.net | Started upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip
5/22/2020 3:36:38 PM | climateprediction.net | Started upload of wah2_anz50_310v_209212_32_872_012026315_0_r2048446855_1.zip
5/22/2020 3:40:01 PM | | Project communication failed: attempting access to reference site
5/22/2020 3:40:01 PM | climateprediction.net | Temporarily failed upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip: transient HTTP error
5/22/2020 3:40:01 PM | climateprediction.net | Backing off 00:02:50 on upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip
5/22/2020 3:40:02 PM | | Internet access OK - project servers may be temporarily down.
5/22/2020 3:42:53 PM | climateprediction.net | Started upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip
5/22/2020 3:47:59 PM | climateprediction.net | Temporarily failed upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip: transient HTTP error
5/22/2020 3:47:59 PM | climateprediction.net | Backing off 00:07:19 on upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip
5/22/2020 3:48:00 PM | | Project communication failed: attempting access to reference site
5/22/2020 3:48:02 PM | | Internet access OK - project servers may be temporarily down.
5/22/2020 3:52:06 PM | climateprediction.net | [error] Error reported by file upload server: EOF on socket read : asked for 262144, got 163840
5/22/2020 3:52:06 PM | climateprediction.net | Temporarily failed upload of wah2_anz50_310v_209212_32_872_012026315_0_r2048446855_1.zip: transient upload error
5/22/2020 3:52:06 PM | climateprediction.net | Backing off 00:03:07 on upload of wah2_anz50_310v_209212_32_872_012026315_0_r2048446855_1.zip

It now uploads the 33 MB files to 100% but doesn't complete.
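
For longer logs, the failed uploads can be tallied per file. This is a rough sketch that assumes each event log line has the form "timestamp | project | message", as in the excerpt above; the function name is made up, and the sample lines are copied from that excerpt.

```python
from collections import Counter


def count_failed_uploads(log_text: str) -> Counter:
    """Count 'Temporarily failed upload of <file>' events per file name."""
    failures = Counter()
    marker = "Temporarily failed upload of "
    for line in log_text.splitlines():
        # Assumed line layout: timestamp | project | message
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue
        message = parts[2]
        if message.startswith(marker):
            # The file name runs up to the colon before the error text.
            filename = message[len(marker):].split(":", 1)[0]
            failures[filename] += 1
    return failures


sample = """\
5/22/2020 3:40:01 PM | climateprediction.net | Temporarily failed upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip: transient HTTP error
5/22/2020 3:40:02 PM | | Internet access OK - project servers may be temporarily down.
5/22/2020 3:47:59 PM | climateprediction.net | Temporarily failed upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip: transient HTTP error
5/22/2020 3:52:06 PM | climateprediction.net | Temporarily failed upload of wah2_anz50_310v_209212_32_872_012026315_0_r2048446855_1.zip: transient upload error
"""

for name, count in count_failed_uploads(sample).items():
    print(count, name)
```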
16) Message boards : Number crunching : New Work Announcements (Message 62459)
Posted 22 May 2020 by Profile geophi
Post:
Another batch with 3,150 more Australia & New Zealand models has gone out, at 50 km resolution and covering 32 months. That makes 9,450 work units issued in total for that sector in the last day.
17) Message boards : Number crunching : "No tasks sent" (Message 62409)
Posted 7 May 2020 by Profile geophi
Post:
Currently getting this response under Linux and BOINC 7.16.6. Status page says there are 1744 jobs available. No other relevant-looking messages.

Were there similar messages before you upgraded to Ubuntu 20.04?

Did you upgrade over 19.10 or did you do a clean install of 20.04?
18) Message boards : Number crunching : New Model Type HadAM4 (Message 62394)
Posted 3 May 2020 by Profile geophi
Post:
Not enough memory for a computer that size, either.

He said he's only running a couple at a time though. If he was trying to run 32, or even 16 that would be a whole different matter.
19) Message boards : Number crunching : Work available and being requested but none downloaded (Message 62391)
Posted 3 May 2020 by Profile geophi
Post:
A separate symptom that might, at a stretch, be related? Normally I can find the results of the weekly credit run on Thursday morning and it hits BAM on Friday morning. So far this week I still cannot see these results - is there a problem?

Sometimes a script crashes, or doesn't get restarted after a server reboot. I e-mailed Andy about the credit thing.

Looks like the credit script ran in the last day. Stats should be updated on cpdn now. Not sure when the credit sites pick up the stats from cpdn.
20) Message boards : Number crunching : New Model Type HadAM4 (Message 62390)
Posted 3 May 2020 by Profile geophi
Post:
I have several errors: "Model crashed: READDUMP: BAD BUFFIN OF DATA". The Wus have been quite advanced.

Like this one: https://www.cpdn.org/result.php?resultid=21924162

I have limited the climateprediction to two concurrent WUs on this computer, so I was wondering if there is a cure.

It looks like the problems started in early to mid April. Did anything change on that PC or the environment it's in during that time frame?

Some of the crashes, and even some of those that said they completed successfully, had negative theta errors in stderr.txt. While that is sometimes a problem with the initial conditions or parameters for a given task or set of tasks, it can also indicate some hardware instability. If the PC is in a particularly dusty or warm environment, that could cause some problems; a thorough cleaning, and a check that air flows well through the system, might rule that out. Or perhaps CPU, memory and hard disk integrity checking software could be run to see whether any obvious errors show up? Just a shot in the dark here, as I'm not certain it is a hardware/cooling issue, but checking those things would at least remove them as possible causes.


©2020 climateprediction.net