Posts by Bryn Mawr

1) Message boards : News : BOINC Needs Votes at a UN Upcoming Forum (Message 70672)
Posted 26 Mar 2024 by Bryn Mawr
Post:
Done
2) Message boards : Number crunching : processors, memory, performance and heat. (Message 70578)
Posted 1 Mar 2024 by Bryn Mawr
Post:
At the expense of thermally stressing the CPU, which does not happen if you use fewer cores for 100% of the time.
That will have a bigger impact on the throughput (tasks completed per day). A 10% drop in CPU use on all cores is all I need on my older machines to get temps I'm happy with. Taking 1 core away from 4 is the same as a 25% CPU use reduction. I prefer having the finer control of %cpu available.


If you’re happy with the reduced life of the CPU then fine, just be aware that you are making that choice.
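To put rough numbers on the trade-off in this exchange, here is a minimal Python sketch. It assumes throughput scales linearly with the number of cores multiplied by the CPU-time percentage, which ignores clock boost, memory contention and thermal behaviour; the function name is made up for illustration.

```python
def effective_cores(cores_in_use: int, cpu_time_percent: float) -> float:
    """Core-equivalents of work, assuming simple linear scaling (an assumption)."""
    return cores_in_use * cpu_time_percent / 100.0


# All 4 cores with a 10% drop in CPU time, as described in the quoted post.
all_cores_throttled = effective_cores(4, 90)

# 3 of 4 cores running continuously.
one_core_removed = effective_cores(3, 100)

# Fractional loss from dropping one core, relative to 4 cores at 100%.
reduction = 1 - one_core_removed / effective_cores(4, 100)

print(all_cores_throttled, one_core_removed, reduction)
```

This only compares nominal throughput. It says nothing about the thermal cycling concern raised in the reply, which is about how the load is applied rather than how much work gets done.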
3) Message boards : Number crunching : processors, memory, performance and heat. (Message 70575)
Posted 29 Feb 2024 by Bryn Mawr
Post:
Unless it's causing WU crashes, it really helps with controlling heat issues on some rigs while still using all cores.


"Use at most x% of CPU time" should be removed from the BOINC source code.


At the expense of thermally stressing the CPU, which does not happen if you use fewer cores for 100% of the time.
4) Message boards : Number crunching : New Work Announcements 2024 (Message 70347)
Posted 9 Feb 2024 by Bryn Mawr
Post:
The priority is to get the replacement EAS25 batches out (for aborted 1002-1004) once the troublesome files have been corrected and tested. It's highly likely they will be using the new WaH2 app which has been in development & tested not to suffer from the excessive failures. It has already been added to the main site as v8.29 of the WAH2 Region Independent app (or wah2-ri for short). If you receive a workunit for wah2-ri 8.29, you're running the new app. George (aka geophi) noted in testing the new app is ~10% faster than the old one.

It's been done this way so as not to interfere with currently running wah2 workunits. There is also a new linux version of wah2-ri which is currently in test (not on the main site yet).

The OpenIFS batch is ready & has been tested, but I'm not happy with some of the failures coming from the monitor code (not the model). To be discussed.

The HadAM4 I think is about ready, maybe needs a bit more testing.

So, a lot of Windows & Linux work coming soon.

I'll be able to confirm more next week after the usual Monday CPDN meeting.

HTH


Many thanks :-)
5) Message boards : Number crunching : New Work Announcements 2024 (Message 70345)
Posted 9 Feb 2024 by Bryn Mawr
Post:
Copied from old thread from Glen.

Forthcoming batches

The following batches are planned for Jan (or early Feb).

a/ Weather@Home (Windows)*

NZ25 - New Zealand 25km grid, natural forcings.
EAS25 - East Asia 25km grid, range of different forcings.


b/ HadAM4 (Linux)
N216 climatological runs producing high frequency northern-hemisphere output.

c/ OpenIFS (Linux)
Low resolution batch to look at variation of model results across different hardware.


*We'll also roll out updated versions of the apps for Weather@Home, HadAM4, & HadSM4 to fix issues with the models failing, particularly on restarts. Although we hope to get these out before the Weather@Home batches it may not happen due to time pressure from the projects funding these batches.

Hoping some of these might come sooner rather than later but I have given up holding my breath!


Any further news of these?

I need to return a result to get rid of the spurious RAC figure from the correction that was done months ago (308,000 where my boxes are capable of 80,000 at best).
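For a feel of how long an inflated RAC would take to fall back once it is being recalculated, here is a rough Python sketch. It assumes the usual BOINC behaviour of an exponential average with a one-week half-life and no new credit arriving; whether the CPDN server uses that default is an assumption, and the function names are made up for illustration.

```python
import math

HALF_LIFE_DAYS = 7.0  # assumed BOINC default half-life for RAC


def decayed_rac(rac: float, days: float) -> float:
    """RAC after `days` of pure decay with no new credit granted."""
    return rac * 0.5 ** (days / HALF_LIFE_DAYS)


def days_to_reach(rac: float, target: float) -> float:
    """Days of pure decay needed for `rac` to fall to `target`."""
    return HALF_LIFE_DAYS * math.log2(rac / target)


# Figures from the post: 308,000 shown, roughly 80,000 achievable.
print(decayed_rac(308_000, 7))
print(round(days_to_reach(308_000, 80_000), 1))
```

In practice new results add credit while the old figure decays, so the real curve would settle toward the genuine RAC rather than toward zero.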
6) Message boards : Number crunching : Relative performance question. (Message 70038)
Posted 10 Nov 2023 by Bryn Mawr
Post:
Presumably the _bl is waiting for memory fetch or disk I/O a lot more than the _ps which is happily sitting in loops computing and racking up the flops.
Nothing to do with memory or I/O. As I said previously they are two very different model configurations. The 'BL' app is running an idealised planet with no land, so all the land surface process code in the model does not run. The PS app is a normal model forecast but with perturbed parameters which potentially gives a different execution time for each individual forecast.


It does not matter what a given CPU is crunching on: it will crunch at the same rate unless it is doing something other than crunching. I was trying to work out what that might be.
7) Message boards : Number crunching : Relative performance question. (Message 70033)
Posted 9 Nov 2023 by Bryn Mawr
Post:
Presumably the _bl is waiting for memory fetch or disk I/O a lot more than the _ps which is happily sitting in loops computing and racking up the flops.
8) Message boards : Number crunching : Recent Average Credit. Correct for user, zero for computers. (Message 69650)
Posted 27 Sep 2023 by Bryn Mawr
Post:
Since the recent “adjustment” to the credit scores my RAC has been 261384 which is definitely not right.

I have processed no tasks since the adjustment so that might be why the RAC is not being recalculated.

ETA Hosts 1537273 and 1537133.

In looking up my host ids I note that my RAC within CPDN is zero so the erroneous figure is only showing in Boinc Manager and BoincStats and this post probably needs to be ignored.
9) Message boards : Number crunching : Credit handed out weekly? (Message 69596)
Posted 6 Sep 2023 by Bryn Mawr
Post:
Confirmation from Andy@CPDN that the export of credit is working again.


Please pass on my thanks 🙏
10) Message boards : Number crunching : Credit handed out weekly? (Message 69589)
Posted 6 Sep 2023 by Bryn Mawr
Post:
Boincstats seems confused, it says I haven't changed position in the last day/week/month, and I've received no credits, but I know I have. I was 583rd before the stats stopped exporting, now I'm 520th.

Doesn't bother me really as I just note the positions on my own graph.

Hopefully now I can run Windows or Linux tasks, I can plough ahead soon.


As always, BS shows intra-day updates under “Today” and only integrates the accumulation into the day/week/month totals and the team/country/world positions at 15:00 UTC during the daily update.
11) Message boards : Number crunching : Credit handed out weekly? (Message 69585)
Posted 5 Sep 2023 by Bryn Mawr
Post:
It looks as though the credit script has run and the extra points have been released to the stats sites.

For the next 60 days my totals are going to look very small on the charts :-)
12) Message boards : Number crunching : Credit Question Answered (Message 68857)
Posted 6 Jun 2023 by Bryn Mawr
Post:
If some extra credit is awarded accidentally, is that a bad thing really? There have been plenty of times where credit is lost accidentally, right?


It’s the volume of the extra credits that’s the problem, along with the extreme RAC figures generated that do not appear to be coming down.

OK, we occasionally lose a few thousand credits in a glitch but gaining a few million credits is a bad thing.
13) Message boards : Number crunching : How big a task can I run? (Message 68729)
Posted 13 May 2023 by Bryn Mawr
Post:
I’ve always understood it to be the size of your swap file.

If that is correct it will not limit the size of task you can run.
14) Message boards : Number crunching : Big credit jump! (Message 68704)
Posted 10 May 2023 by Bryn Mawr
Post:
Credit total is up by 20%.

RAC, I wish :-) (My total RAC is running at around 50,000, my RAC for CPDN is now 260,000+).
15) Message boards : Cafe CPDN : World Community Grid mostly down for 2 months while transitioning (Message 68633)
Posted 31 Mar 2023 by Bryn Mawr
Post:
I had a 24 core per region quota limit, though I've not tried to raise it. I was able to just create some 4/8/22 core VMs and they're purring away nicely.

Rosetta@Home is out of tasks, though there are still some retries going around.

I suppose the next projects to run out of work would be Folding@Home CPU or maybe Einstein@Home, though I don't find pulsars/neutron stars/etc particularly interesting projects, and certainly not of much "Earthly importance."


The next project to run out of work will be TN-Grid in about 10 days' time :-(

Of my 5 projects that will be 4 of them without work.
16) Message boards : Number crunching : Server Status page questions (Message 68607)
Posted 19 Mar 2023 by Bryn Mawr
Post:

Which applications exactly are the "region independent tasks?"
I keep getting hadam4 WUs and I sure do NOT want to waste my electric bill on useless garbage.
If there's obsolete WUs circulating then the project should issue "server aborts" for all of them and clear the decks of the flotsam and jetsam.



Fourth line down :-

Weather At Home 2 (wah2) (region independent) 0 4731 --- 0

They’re Windows tasks, not related to your hadam4 WUs.
17) Message boards : Number crunching : The uploads are stuck (Message 68154)
Posted 31 Jan 2023 by Bryn Mawr
Post:
I'll mention it and see what response I get, credit is actually quite a pain for a project to have to manage from what I've learnt.
Returning valid results for the researchers to analyze is the important matter, credit is a 'nice to have'. Others may think otherwise, but I won't mind if the credit update is rare or doesn't happen.

I tend to agree. For myself, credit is more about another tool to spot problems than anything else. But I get that for some it is a motivating factor.

Some kind of accounting of work done, in both relative and absolute measure, is important, I'd argue. I have little doubt that at one point or another everyone looked at their credit and standings even if to just get an idea of the level of contribution over time. It's pretty normal behavior and I have no doubt that everyone would be asking for some way of knowing how much one's contributed if there was no such measure provided at all. I agree that a complicated credit system is unnecessary and a weekly credit run is fine. Unlike other projects, on CPDN task counts don't get erased from when you first join so you can actually tell how many tasks you've processed and calculate your error rate long term. I think that's kind of a nice thing. My guess is that a cluttered and somewhat uninformative Project Status page is the trade-off. Badges might be nice as they'll likely provide an incentive for many people, although I'd rather see a number of other things improved first before worrying about badges.


Rather than the absolute credit count (although that is nice as well) I find the RAC a more useful measure of how well the system is performing - not that I’m hinting or anything :-)
18) Message boards : Number crunching : One of my computers is hoarding tasks and I don't know why (Message 68077)
Posted 27 Jan 2023 by Bryn Mawr
Post:
Well, I probably should have debugged this first. <work_fetch_debug> did the trick and this likely has nothing to do with version. For the top priority project that was picked, I saw this repeatedly showing up in every fetch cycle.

[work_fetch] deferring work fetch; upload active

I was able to observe the same issue happening with LHC when it's slowly uploading after a batch of tasks finish. Turns out if the current project picked for fetching is constantly uploading, the fetch will just be deferred and boinc won't try the next project. It's the right behavior if we assume upload should be quick, but sometimes uploading is going to take forever... Guess I need to build up a job cache big enough for the entire period of uploads just in case the project uploading ends up getting picked for fetching.


I must have been lucky and done a work fetch during a period of project backoff. During the several periods of upload problems I’ve been receiving tasks as normal.
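To check how often this deferral happens on a given host, one option is to count the message quoted above in a saved copy of the event log. The following is a minimal Python sketch: the marker string is taken from the quoted post, while the sample log text and the function name are made up for illustration and do not reproduce the real log layout.

```python
# Message text as quoted in the post above.
DEFERRAL_MARKER = "[work_fetch] deferring work fetch; upload active"


def count_upload_deferrals(log_text: str) -> int:
    """Count log lines that report a work fetch deferred by an active upload."""
    return sum(1 for line in log_text.splitlines() if DEFERRAL_MARKER in line)


# Hypothetical sample; real event log lines carry timestamps and project names.
sample_log = """\
(hypothetical line) [work_fetch] deferring work fetch; upload active
(hypothetical line) unrelated message
(hypothetical line) [work_fetch] deferring work fetch; upload active
"""

print(count_upload_deferrals(sample_log))
```

A high count during a slow upload period would match the behaviour described in the quoted post, while a host that fetched during a project backoff, as in the reply, would show few or none.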
19) Message boards : Number crunching : Why does this task fail ? (Message 67814)
Posted 17 Jan 2023 by Bryn Mawr
Post:
Okay, here is the next failed WU and I ask again, why has this WU failed: https://www.cpdn.org/result.php?resultid=22268795

I'm seeing no stderr output.
That's quite strange.


And return code zero.

Does that suggest that it failed in the wrapper?
20) Message boards : Number crunching : no credit awarded? (Message 67754)
Posted 15 Jan 2023 by Bryn Mawr
Post:
BOINC is showing its weekly increase in credit from CPDN today. If not showing on BOINCSTATS, I wonder if CPDN have changed something so BOINCSTATS is looking in the wrong place or the page where it looks hasn't been updated for some reason. If it looks like that is the case I will email Andy.


It’s the CPDN account management page that’s not being updated before it goes to BoincStats. I think the delay to BS is just that the update missed the cutoff.


©2024 climateprediction.net