Posts by Paul

1) Message boards : Number crunching : EAS batches 1001-4 (Message 70180)
Posted 22 Jan 2024 by Paul
Post:
Just to be clear, this segv problem with these tasks is nothing to do with the files produced by the model - so don't waste time waiting for the OS to do its thing. It's a memory issue related to the model starting up, not reading the files.

Thanks for the useful explanation Glenn.

I had noticed that the failures occurred at start up, so having that confirmed is really useful.

And having had 4 tasks fail today, I've set BOINC to no new work for CPDN for the moment.
2) Message boards : Number crunching : EAS batches 1001-4 (Message 70173)
Posted 21 Jan 2024 by Paul
Post:
You are likely to experience a high failure rate with them doing that even if you do suspend the tasks and wait long enough to ensure all disk writes are finished before closing down BOINC. Doing that does reduce the failure rate.

That seems a bit strange, at least to me. 🙂 I would have thought that as long as the last checkpoint was successful, that would have saved everything and that is where the task would start from next time??

Thanks!
3) Message boards : Number crunching : EAS batches 1001-4 (Message 70167)
Posted 20 Jan 2024 by Paul
Post:
The signal 11 error is a known issue with these tasks. They are particularly prone to it if interrupted so best not to run them on machines that will be restarted. Some will still fail with this error even under perfect conditions however.

I have only one PC. It is restarted each day.

Would you prefer that I not run these tasks?
4) Message boards : Number crunching : BOINC Client Improvements (Message 60790)
Posted 5 Aug 2019 by Paul
Post:
Oxford Uni is in between terms, and BOINC and its foibles aren't the be all and end all for the Drs and Professors.

Things will happen when they happen.


Les, I appreciate that you're trying to be helpful, but are you really trying to make the case that David (who I infer from your message is highly educated) wasn't aware of this term break when he started this thread? And how do we know that the design studio don't want to get on with things and that they aren't being held up by everything having to happen on Oxford time?

Just what do we have to do to get these Oxford boys to treat volunteers with basic respect?
5) Message boards : Number crunching : BOINC Client Improvements (Message 60780)
Posted 2 Aug 2019 by Paul
Post:
Anyone know what is happening with this?

I've PMed David but there's been no response.
6) Message boards : Number crunching : BOINC Client Improvements (Message 60657)
Posted 13 Jul 2019 by Paul
Post:
I'm prepared to participate in this. How do we apply?
7) Message boards : Number crunching : Upload failures (Message 60580)
Posted 4 Jul 2019 by Paul
Post:
I DID suggest that people suspend running models until the problem was fixed, but it looks like no one listens anymore.


Do we have any idea how many of the 8,945 users with recent credit visit the message boards, and so would see your message?

I assumed that it was going to be a very small proportion, so I didn't see the point in suspending things.

So not so much not listening, just not seeing the point.
8) Message boards : Number crunching : Upload failures (Message 60573)
Posted 3 Jul 2019 by Paul
Post:
Continuing to get this error:

7/3/2019 8:04:28 AM | climateprediction.net | Started upload of wah2_sam50_n6hw_201612_25_822_011884425_0_r639342217_2.zip
7/3/2019 8:04:30 AM | | Project communication failed: attempting access to reference site
7/3/2019 8:04:31 AM | | Internet access OK - project servers may be temporarily down.
7/3/2019 8:04:52 AM | climateprediction.net | Temporarily failed upload of wah2_sam50_n6hw_201612_25_822_011884425_0_r639342217_2.zip: transient HTTP error


That's the error that we're all getting due to the problems that the project is having.

Just let BOINC keep trying. Eventually it will upload.
9) Message boards : Number crunching : Upload failures (Message 60503)
Posted 30 Jun 2019 by Paul
Post:
A reminder for everyone:

News


And a reminder for the project - that post included:

Once today's processing has completed we can give a firm timeline on when the system will return into operation.


That post was made on 27 Jun 2019, 11:58:29 UTC. Have we received the firm timeline for the return to normal operation? It's nearly 3 days later.

Just why is this project so bad at keeping people informed?
10) Message boards : Number crunching : Batch 774 (safr50) (Message 59117)
Posted 28 Nov 2018 by Paul
Post:
Thanks for confirming the problem. I had 4 of these.
11) Message boards : Number crunching : For the betterment of BOINC (Message 56608)
Posted 1 Aug 2017 by Paul
Post:
Thanks for the opportunity to make some suggestions:

1. In BOINC Manager, in the Event Log, you can filter the entries by project only. I'd like to be able to filter by task as well. In both cases, it would be really good if the dropdown box listed the projects and tasks, so that you don't have to go searching for an entry to use as the filter "base".

2. This is a minor annoyance, but I'll put it on the list in case anyone is looking at the scheduler. I use app_config.xml files https://boinc.berkeley.edu/trac/wiki/ClientAppConfig to restrict the number of tasks that can run for projects that don't respect the Use at most N % CPU time option. However, the scheduler doesn't seem to cope properly with the restriction on the number of tasks. There are occasions where, even though there is work available, a core sits idle. Presumably the scheduler wants to run a project with a restricted number of tasks but can't because of the restriction, and doesn't want to run a project which isn't restricted.
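
To illustrate suggestion 1, here is a rough sketch of filtering Event Log text by project and by task outside BOINC Manager. It assumes the log has been copied to plain text with lines laid out as "timestamp | project | message". The names LogEntry, parse_line and filter_entries, the sample lines and the task names are all made up for this example; none of this is BOINC's own code.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    project: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one 'timestamp | project | message' line; return None if it doesn't fit."""
    parts = line.split("|", 2)
    if len(parts) != 3:
        return None
    timestamp, project, message = (p.strip() for p in parts)
    return LogEntry(timestamp, project, message)


def filter_entries(
    lines: Iterable[str],
    project: Optional[str] = None,
    task: Optional[str] = None,
) -> Iterator[LogEntry]:
    """Yield entries matching the project exactly and/or mentioning the task name."""
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip anything that is not a three-field log line
        if project is not None and entry.project != project:
            continue
        if task is not None and task not in entry.message:
            continue
        yield entry


# Hypothetical sample lines in the assumed layout (not real log output).
SAMPLE = [
    "1/8/2017 09:00:00 | climateprediction.net | Starting task example_task_1",
    "1/8/2017 09:00:05 | | Internet access OK",
    "1/8/2017 09:01:00 | otherproject.example | Starting task example_task_2",
    "1/8/2017 09:05:00 | climateprediction.net | Computation for task example_task_1 finished",
    "not a log line",
]

if __name__ == "__main__":
    for entry in filter_entries(SAMPLE, project="climateprediction.net", task="example_task_1"):
        print(entry.timestamp, "|", entry.message)
```

The task filter here is just a substring match on the message text, which is the simplest thing that could work. A dropdown as suggested above would need the client to supply the list of known task names.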
12) Message boards : Number crunching : MORE FAILED DOWNLOADS (Message 56209)
Posted 12 May 2017 by Paul
Post:
Stalled downloads have downloaded here too.

Thanks Dave Jackson for keeping us up to date.
13) Message boards : Number crunching : No work for Windows? (Message 52780)
Posted 31 Oct 2015 by Paul
Post:
And there haven't been any Windows or Mac tasks for several weeks now.


Thanks.

But if that's the case, why is this project so aggressive about sending the following message?

31/10/2015 18:16:59 | climateprediction.net | Message from server: No work available for the applications you have selected. Please check your project preferences on the web site.


If you're going to send that message every couple of days when I'm not able to get a CPDN task, shouldn't there be some work available?
14) Message boards : Number crunching : What has the project learned from the recent outage? (Message 52779)
Posted 31 Oct 2015 by Paul
Post:
From what I've seen, communication about the recent outage has been extremely poor, and even now that things are starting to return to normal, this is continuing.

I went to Twitter https://twitter.com/CPDN_BOINC but there was very little information there. Apparently, we were supposed to know to go to http://boinc.berkeley.edu/dev/forum_thread.php?id=10279 for information.

Has the project learned anything from this outage?

Will we be getting any information about what the project has learned?

Has the project at least learned that it needs to do much much better at keeping people up to date about what is happening?
15) Message boards : Number crunching : No work for Windows? (Message 52775)
Posted 31 Oct 2015 by Paul
Post:
ALL tasks come from researchers in various climate centres around the world.


So you're saying that the lack of tasks for Windows has nothing to do with the recent problems?
16) Message boards : Number crunching : HadAM3P - permanent HTTP error download failed? (Message 51799)
Posted 9 Apr 2015 by Paul
Post:
Yes, just seen the same messages at around 16:30 UK time.

The Projects tab is showing communication deferred for climateprediction.net, so I'm hoping that the download will complete sometime.
17) Message boards : Number crunching : hadcm3s_7 errors (Message 51465)
Posted 25 Feb 2015 by Paul
Post:
Thanks.

That'll be the problem, but I'll wait until things get fixed rather than reinstalling.
18) Message boards : Number crunching : hadcm3s_7 errors (Message 51461)
Posted 25 Feb 2015 by Paul
Post:
I've had 3 of these:

- http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=9493906
- http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=9486021
- http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=9451109

that have errored out for me and 1, 2 or 3 others.

Is there a problem with these work units?

Thanks
19) Message boards : Number crunching : Backup document no longer available (Message 49209)
Posted 25 May 2014 by Paul
Post:
Thanks - I forgot that these forums only search the past 30 days by default.

Is there any up to date information on restoring from a backup when you're running more than one project? That's my main concern. Hopefully not something that I'll have to face, but it would be good to know the process before the problem occurs.

20) Message boards : Number crunching : Backup document no longer available (Message 49206)
Posted 24 May 2014 by Paul
Post:
The backup document that is mentioned in http://www.climateprediction.net/support/technical-faq/#What_are_the_implications.3F and http://boincfaq.mundayweb.com/index.php?language=1&view=97 no longer exists. The current link is http://www.boinc-wiki.info/Backup_BOINC

Could someone please update these documents or post a current link to the document in this thread?

Thanks!



