Posts by geophi

1) Message boards : Number crunching : Uploading files fails (Message 62521)
Posted 2 days ago by Profile geophi
Post:
Similar to Iain, two tasks have finished on my i7 4770. However, none of the files have uploaded. Some will show 100% in the Transfers tab for a while, but won't complete. Sixty-eight files are waiting to upload.

I reported such to the appropriate people on the project.
2) Message boards : Number crunching : Credits (Message 62507)
Posted 3 days ago by Profile geophi
Post:
Data Export and Credits seem to be stuck again. Data export files are showing dates of 5 days ago.

db_dump.xml 2020-05-20 00:30 749
host.gz 2020-05-20 00:30 2.5M
tables.xml 2020-05-20 00:30 3.4K
team.gz 2020-05-20 00:30 773K
user.gz 2020-05-20 00:30 114K

Can you send an email, Les?

Thanks
Bill Freauff

The credits are calculated once a week, on Wednesday or Thursday, I believe. I think that's when the data export files are updated.
3) Message boards : Number crunching : Uploading files fails (Message 62476)
Posted 7 days ago by Profile geophi
Post:
And the error message is?

Trying to upload zips to upload4...


5/22/2020 3:34:53 PM | climateprediction.net | Started upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip
5/22/2020 3:36:38 PM | climateprediction.net | Started upload of wah2_anz50_310v_209212_32_872_012026315_0_r2048446855_1.zip
5/22/2020 3:40:01 PM | | Project communication failed: attempting access to reference site
5/22/2020 3:40:01 PM | climateprediction.net | Temporarily failed upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip: transient HTTP error
5/22/2020 3:40:01 PM | climateprediction.net | Backing off 00:02:50 on upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip
5/22/2020 3:40:02 PM | | Internet access OK - project servers may be temporarily down.
5/22/2020 3:42:53 PM | climateprediction.net | Started upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip
5/22/2020 3:47:59 PM | climateprediction.net | Temporarily failed upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip: transient HTTP error
5/22/2020 3:47:59 PM | climateprediction.net | Backing off 00:07:19 on upload of wah2_anz50_32by_209712_32_872_012028010_0_r1951086341_1.zip
5/22/2020 3:48:00 PM | | Project communication failed: attempting access to reference site
5/22/2020 3:48:02 PM | | Internet access OK - project servers may be temporarily down.
5/22/2020 3:52:06 PM | climateprediction.net | [error] Error reported by file upload server: EOF on socket read : asked for 262144, got 163840
5/22/2020 3:52:06 PM | climateprediction.net | Temporarily failed upload of wah2_anz50_310v_209212_32_872_012026315_0_r2048446855_1.zip: transient upload error
5/22/2020 3:52:06 PM | climateprediction.net | Backing off 00:03:07 on upload of wah2_anz50_310v_209212_32_872_012026315_0_r2048446855_1.zip

It now uploads the 33 MB files to 100% but doesn't complete.
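
When the event log gets long, it can help to tally the temporary failures per file instead of reading every line. Below is a minimal sketch, assuming the log lines keep the "timestamp | project | message" layout shown above; the function name summarize_failures and the regular expressions are my own illustration, not part of BOINC.

```python
import re
from collections import Counter

# One event-log line: "timestamp | project | message" (layout assumed from the excerpt above).
LINE_RE = re.compile(r"^(?P<ts>[^|]+)\|(?P<project>[^|]*)\|(?P<msg>.*)$")
# Message text the client prints for a temporarily failed upload.
FAIL_RE = re.compile(r"Temporarily failed upload of (?P<name>\S+): (?P<reason>.+)$")


def summarize_failures(lines):
    """Count temporary upload failures per (file name, reason) pair."""
    failures = Counter()
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if not parsed:
            continue  # skip blank lines or anything that is not a log line
        failed = FAIL_RE.search(parsed.group("msg"))
        if failed:
            key = (failed.group("name"), failed.group("reason").strip())
            failures[key] += 1
    return failures
```

Feeding it the lines copied from the Event Log gives one count per file and failure reason, which makes it easier to see whether one upload is stuck or all of them are.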
4) Message boards : Number crunching : New Work Announcements (Message 62459)
Posted 7 days ago by Profile geophi
Post:
Another batch with 3,150 more Australia & New Zealand models has gone out, at 50 km resolution and covering 32 months. That makes 9,450 work units issued in total for that sector in the last day.
5) Message boards : Number crunching : "No tasks sent" (Message 62409)
Posted 22 days ago by Profile geophi
Post:
Currently getting this response under Linux and BOINC 7.16.6. Status page says there are 1744 jobs available. No other relevant-looking messages.

Were there similar messages before you upgraded to Ubuntu 20.04?

Did you upgrade over 19.10 or did you do a clean install of 20.04?
6) Message boards : Number crunching : New Model Type HadAM4 (Message 62394)
Posted 26 days ago by Profile geophi
Post:
Not enough memory for a computer that size, either.

He said he's only running a couple at a time though. If he were trying to run 32, or even 16, that would be a whole different matter.
7) Message boards : Number crunching : Work available and being requested but none downloaded (Message 62391)
Posted 26 days ago by Profile geophi
Post:
A separate symptom that might, at a stretch, be related? Normally I can find the results of the weekly credit run on Thursday morning and it hits BAM on Friday morning. So far this week I still cannot see these results - is there a problem?

Sometimes a script crashes, or doesn't get restarted after a server reboot. I e-mailed Andy about the credit thing.

Looks like the credit script ran in the last day. Stats should be updated on cpdn now. Not sure when the credit sites pick up those stats from cpdn.
8) Message boards : Number crunching : New Model Type HadAM4 (Message 62390)
Posted 26 days ago by Profile geophi
Post:
I have several errors: "Model crashed: READDUMP: BAD BUFFIN OF DATA". The WUs have been quite advanced.

Like this one: https://www.cpdn.org/result.php?resultid=21924162

I have limited the climateprediction to two concurrent WUs on this computer, so I was wondering if there is a cure.

It looks like the problems started in early to mid April. Did anything change on that PC or the environment it's in during that time frame?

Some of the crashes, and even some of those that said they completed successfully, had negative theta errors in stderr.txt. While that is sometimes a problem with the initial conditions or parameters for a given task or set of tasks, it can also indicate some hardware instability. If the PC is in a particularly dusty or warm environment, that could cause problems, and a thorough cleaning plus a check for good airflow through the system might remove that possibility. Or perhaps CPU, memory, and hard disk integrity checking software could be run to see if any obvious errors turn up? Just a shot in the dark here, as I'm not certain it is a hardware/cooling issue, but checking those things would at least rule them out as causes of the problems.
9) Message boards : Number crunching : Work available and being requested but none downloaded (Message 62387)
Posted 28 days ago by Profile geophi
Post:
A separate symptom that might, at a stretch, be related? Normally I can find the results of the weekly credit run on Thursday morning and it hits BAM on Friday morning. So far this week I still cannot see these results - is there a problem?

Sometimes a script crashes, or doesn't get restarted after a server reboot. I e-mailed Andy about the credit thing.
10) Message boards : Cafe CPDN : Status page. (Message 62302)
Posted 13 Apr 2020 by Profile geophi
Post:
And someone has returned a WAH2RI task! How long is it since any of them were sent out???

Looking at tasks for my longest continually running PC, finally retired last fall, the last batches issued for RI might have been Oct 2016. So 3.5 years ago.
11) Message boards : Number crunching : No trickles on webpage (Message 62272)
Posted 31 Mar 2020 by Profile geophi
Post:
This has been reported to the project IT staff.
12) Questions and Answers : Macintosh : Not downloading tasks (Message 62265)
Posted 27 Mar 2020 by Profile geophi
Post:
Hi,
I wonder if this is the case for Quake Catcher, Rosetta@home?? I know SETI is coming to end of life. I look at my BAM stats and all my projects (Climate Prediction, Quake Catcher, Rosetta & SETI) show "No previous contact", so I'm not sure what to make of it, other than that I'm spending a lot of time trying to figure out if tasks are being uploaded back to the mother ship after crunching on my machines. The "Tasks" tab in BOINC Manager looks like things are crunching along, but if I look at individual project sites it's indicating 0 as far as any credits go. Thanks for any input!

Only one type of model that can run on a Mac has had any batches released in the last 9 months, and that was last fall. These are the models/applications and the platforms they can run on:

https://www.cpdn.org/apps.php

The hadcm3 short model had some batches last fall, but nothing since. Batches since then have been for Linux or Windows. I have no idea if/when anything will be released for Macs. There really hasn't been any new work for any of the OSes for 2 months.
13) Questions and Answers : Getting started : Reset my password (Message 62224)
Posted 11 Mar 2020 by Profile geophi
Post:
If you are already logged in, go to https://www.cpdn.org/home.php and click on the password link in the Account Information section.

If you are not logged in, go to https://www.cpdn.org/login_form.php and click on the "forgot password?" link under the word "Password:"

If for some reason you don't see either of those links, make sure you are not trying to reach the password change pages through a VPN or the Tor browser, which may not work.
14) Questions and Answers : Getting started : Can't establish CPN project on BOINC (Message 62212)
Posted 9 Mar 2020 by Profile geophi
Post:
If your user account hasn't been accessed in a number of years, the issue with logging in may be described by this post in the News forum section.

https://www.cpdn.org/forum_thread.php?id=8255&postid=57145
15) Questions and Answers : Unix/Linux : fedora 30 64 bit (Message 62203)
Posted 6 Mar 2020 by Profile geophi
Post:
Congratulations nairb. It was quite the process to get there but hopefully things will go much smoother from now on.
16) Message boards : Number crunching : CPDN No Longer Supporting Mac BOINC Clients ?? (Message 62198)
Posted 4 Mar 2020 by Profile geophi
Post:
If the OS is Catalina, then there's no 32 bit support now.
And since the apps here are 32 bit, the server would be quite correct.

Looks like that must be the issue. Darwin 19.4 is the latest version of Catalina.
17) Message boards : Number crunching : CPDN No Longer Supporting Mac BOINC Clients ?? (Message 62195)
Posted 4 Mar 2020 by Profile geophi
Post:
Well, from the Applications page, supposedly two applications still support Mac OS X Intel. Hadcm3n (no batch released in the last several years) and hadcm3s (last batch issued in mid-Oct last year).
18) Message boards : Cafe CPDN : SETI@home suspending work distribution 31 March 2020 (Message 62183)
Posted 3 Mar 2020 by Profile geophi
Post:
https://www.bleepingcomputer.com/news/software/seti-home-search-for-alien-life-project-shuts-down-after-21-years/

"It's a lot of work for us to manage the distributed processing of data. We need to focus on completing the back-end analysis of the results we already have, and writing this up in a scientific journal paper," their news announcement stated.
19) Questions and Answers : Unix/Linux : fedora 30 64 bit (Message 62180)
Posted 2 Mar 2020 by Profile geophi
Post:
Here are 8, for different tasks, for one of my Linux PCs. I've probably saved over a dozen across a number of PCs over the last year by deleting the problematic files. Of course I've forgotten to do it a couple of times and those were some of the failures. Most of the time cleanly shutting down boinc does not leave the problematic "finished" file in the slots directories. But once in a while it does, and it might be in multiple slots directories with that shutdown, and thus cause multiple failures if not cleaned up. I was running an older version of boinc and the line in stderr for the failure is "finish file present too long". The error with a much more recent version of boinc in nairb's crash was "Process still present 5 min after writing finish file; aborting".

https://www.cpdn.org/result.php?resultid=21871647
https://www.cpdn.org/result.php?resultid=21872302
https://www.cpdn.org/result.php?resultid=21782968
https://www.cpdn.org/result.php?resultid=21782986
https://www.cpdn.org/result.php?resultid=21744149
https://www.cpdn.org/result.php?resultid=21744184
https://www.cpdn.org/result.php?resultid=21744785
https://www.cpdn.org/result.php?resultid=21743766
20) Questions and Answers : Unix/Linux : fedora 30 64 bit (Message 62178)
Posted 2 Mar 2020 by Profile geophi
Post:
Oh dear... I needed to restart the desktop. All of the 3 models resumed - then one failed with computing error. It had only been running for 3 days. But I checked how successful the remaining models had been with other computers. Gulp....... not one had been successful.

I needed to restart the desktop machine after a software update which included changes/updates to the fc30 kernel. Maybe this is not a wise thing to do when a model has started. I doubt this is a fedora 30 problem tho.


Sometimes when suspending and exiting boinc, or just exiting boinc, a file will be left in the slots/x directory or directories (where x is a number) for the models being run. The filename will have a "finished" string as part of the name. It's not supposed to be there, and if it is there when the model starts back up, the model will self abort. It's a bug in the boinc code, and it doesn't just affect cpdn. Over the years, the error message in stderr has changed somewhat, but the problem still exists. I've taken to checking the slots directories after I exit boinc to make sure that file does not exist in those directories. It's kind of a pain, but I've lost several models over the last several years from this bug.
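
The manual check of the slots directories could be scripted. Here is a rough sketch, assuming the leftover file can be recognised by the substring "finish" in its name and that the slots directories sit directly under the BOINC data directory; the function name find_finish_files and the marker default are my own choices, so adjust them to what you actually see on your system. It only lists candidates and deletes nothing, and it should be run only after boinc has fully exited.

```python
from pathlib import Path


def find_finish_files(data_dir, marker="finish"):
    """List files in data_dir/slots/<n>/ whose name contains marker.

    The marker is an assumption based on the description above; nothing is deleted.
    """
    slots = Path(data_dir) / "slots"
    if not slots.is_dir():
        return []  # no slots directory: nothing to report
    hits = []
    for slot in sorted(slots.iterdir()):
        if not slot.is_dir():
            continue
        for entry in sorted(slot.iterdir()):
            # Only regular files directly inside a slot directory are considered.
            if entry.is_file() and marker in entry.name.lower():
                hits.append(entry)
    return hits
```

For example, find_finish_files("/var/lib/boinc-client") would scan a typical Linux package install, though that path is an assumption and differs between distributions and operating systems. Any file it reports can then be inspected and removed by hand before restarting boinc.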



©2020 climateprediction.net