New work Discussion

Jim1348

Joined: 15 Jan 06
Posts: 473
Credit: 21,432,250
RAC: 43,160
Message 62045 - Posted: 26 Jan 2020, 16:11:10 UTC - in response to Message 62043.  

And on my i7-9700 (which has eight full cores), it checkpoints at 23 minutes. But that is again with limiting the N216 to running on only four cores. The other four cores are on TN-Grid, which seems to be an easy project for this purpose.

In general, I find that I need to limit any of my CPUs (Intel Coffee Lake or Ryzen) to four cores for the N216, but can put just about anything else on the other cores without much ill effect. Beyond four cores, performance drops off a cliff.
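One way to apply a four-task cap like this is a BOINC app_config.xml file in the project's directory. The sketch below generates such a file with `max_concurrent` set to 4. The app name used here is a placeholder assumption, not the real N216 application name; check the name your client reports before using it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return the text of a BOINC app_config.xml that caps one app's running tasks."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    # ET.indent exists from Python 3.9; older versions emit a single line.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "n216_app_name_placeholder" is hypothetical; substitute the real app name.
    print(build_app_config("n216_app_name_placeholder", 4))
```

The output would be saved as app_config.xml in the CPDN project folder and picked up after the client re-reads its config files. The remaining cores stay free for other projects, as described above.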
ID: 62045
wolfman1360
Joined: 18 Feb 17
Posts: 71
Credit: 4,564,803
RAC: 12,152
Message 62046 - Posted: 26 Jan 2020, 17:00:06 UTC - in response to Message 62045.  

And on my i7-9700 (which has eight full cores), it checkpoints at 23 minutes. But that is again with limiting the N216 to running on only four cores. The other four cores are on TN-Grid, which seems to be an easy project for this purpose.

In general, I find that I need to limit any of my CPUs (Intel Coffee Lake or Ryzen) to four cores for the N216, but can put just about anything else on the other cores without much ill effect. Beyond four cores, performance drops off a cliff.

I vaguely remember a discussion of Rosetta eating up L3 cache as well, but can't find it anywhere.
Is this still true today, and should I be limiting it alongside the N216 and N144?
ID: 62046
Jim1348
Joined: 15 Jan 06
Posts: 473
Credit: 21,432,250
RAC: 43,160
Message 62047 - Posted: 26 Jan 2020, 17:15:33 UTC - in response to Message 62046.  

I vaguely remember a discussion of Rosetta eating up L3 cache as well, but can't find it anywhere.
Is this still true today, and should I be limiting it alongside the N216 and N144?

Good question. If you look, I think you will find that I initiated that subject on Rosetta. The answer is that insofar as I can tell, Rosetta works OK with CPDN, though at the moment I like TN-Grid even better. But the "cache" issue is a bit tricky. It seems to be not just the size of the cache, or else I could run a lot more N216 on my Ryzen 3600 than my Ryzen 2600, for example. Maybe it is how the cache is used, or even a question of the L2 cache rather than the L3 cache.

At any rate, you ultimately have to try it out. I don't see much problem with the N144 though.
ID: 62047
alanb1951
Joined: 31 Aug 04
Posts: 15
Credit: 6,194,978
RAC: 10,818
Message 62050 - Posted: 27 Jan 2020, 6:29:47 UTC - in response to Message 62046.  
Last modified: 27 Jan 2020, 6:35:04 UTC

@Wolfman1360
I vaguely remember a discussion of Rosetta eating up L3 cache as well, but can't find it anywhere.
Is this still true today, and should I be limiting it alongside the N216 and N144?

Jim1348 has referred to local threads where this has come up; if you look in the threads about UK Met Office HadAM4 at N216 resolution and UK Met Office HadAM4 at N144 resolution you'll find several mentions of L3 cache bashing, especially in the N216 thread. In this message in the N144 thread I actually replied to one of your posts, talking about workload mixes (and again in this message)... Jim1348 (and others) had some good contributions in those threads too. I don't recall many explicit references to Rosetta, but WCG MIP1 (which uses Rosetta) got some dishonourable mentions...

You may also have seen (or even participated in) threads about MIP1 at WCG -- because of the model construction it uses, the rule of thumb is one MIP1 task per 4 or 5 MB of L3 cache! I haven't got time to track those down at the moment - sorry!
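As a rough illustration of that rule of thumb, the arithmetic can be sketched as below. The function name and the example cache sizes are assumptions for illustration only, and the result is a starting point to test on your own machine rather than a guarantee.

```python
def max_tasks_for_cache(l3_cache_mb: float, mb_per_task: float = 5.0) -> int:
    """Estimate how many cache-hungry tasks fit, at mb_per_task MB of L3 each."""
    if l3_cache_mb < 0:
        raise ValueError("l3_cache_mb must not be negative")
    if mb_per_task <= 0:
        raise ValueError("mb_per_task must be positive")
    # Round down: a partial share of cache does not count as room for a task.
    return int(l3_cache_mb // mb_per_task)


if __name__ == "__main__":
    # Hypothetical L3 sizes in MB, checked at both ends of the 4-5 MB range.
    for l3_mb in (16, 32):
        low = max_tasks_for_cache(l3_mb, 5)
        high = max_tasks_for_cache(l3_mb, 4)
        print(f"{l3_mb} MB L3: about {low} to {high} tasks")
```

For a 32 MB L3 cache this gives about 6 to 8 tasks, and for 16 MB about 3 to 4.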

For what it's worth, if you run MIP1 alongside N216 you'll see the same sort of hit as if running extra N216 tasks; N144 is nowhere near as bad!

Cheers - Al.

[Edited to fix a broken link, then to fix a typo I'd missed!]
ID: 62050
wolfman1360
Joined: 18 Feb 17
Posts: 71
Credit: 4,564,803
RAC: 12,152
Message 62125 - Posted: 15 Feb 2020, 6:48:51 UTC - in response to Message 62050.  
Last modified: 15 Feb 2020, 6:51:17 UTC

@Wolfman1360
I vaguely remember a discussion of Rosetta eating up L3 cache as well, but can't find it anywhere.
Is this still true today, and should I be limiting it alongside the N216 and N144?

Jim1348 has referred to local threads where this has come up; if you look in the threads about UK Met Office HadAM4 at N216 resolution and UK Met Office HadAM4 at N144 resolution you'll find several mentions of L3 cache bashing, especially in the N216 thread. In this message in the N144 thread I actually replied to one of your posts, talking about workload mixes (and again in this message)... Jim1348 (and others) had some good contributions in those threads too. I don't recall many explicit references to Rosetta, but WCG MIP1 (which uses Rosetta) got some dishonourable mentions...

You may also have seen (or even participated in) threads about MIP1 at WCG -- because of the model construction it uses, the rule of thumb is one MIP1 task per 4 or 5 MB of L3 cache! I haven't got time to track those down at the moment - sorry!

For what it's worth, if you run MIP1 alongside N216 you'll see the same sort of hit as if running extra N216 tasks; N144 is nowhere near as bad!

Cheers - Al.

[Edited to fix a broken link, then to fix a typo I'd missed!]


Thanks for all of these.
So far I seem to be doing okay, but I may have bitten off a little more than I can chew. I have an old dual Opteron plugging away at 3 N216 tasks - I figure a month in which they are actually being worked on is better than a month of them sitting there with nothing grabbing them. I am exaggerating, of course - it shouldn't take quite that long since it is a dedicated cruncher, but who knows.
I tend to stay away from MIP at WCG and have recently been crunching Asteroids at home alongside CPDN and Rosetta, though I do have one machine running TN-Grid and it seems to be doing fine as well. My RAC has drastically decreased but should be rising soon enough after playing with the config for CPDN. I am still being very conservative since I'd rather not have computing errors, as has happened a few times already on my Ryzen 1700.
ID: 62125
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2653
Credit: 3,184,669
RAC: 1,147
Message 62126 - Posted: 15 Feb 2020, 7:49:22 UTC - in response to Message 62125.  

My RAC has drastically decreased but should be rising soon enough after playing with the config for CPDN.


I am not sure where CPDN sits in the tables for credit for time spent crunching. I know it isn't at the top but I suspect there are probably projects below it as well.
ID: 62126
Harri Liljeroos
Joined: 9 Dec 05
Posts: 68
Credit: 11,135,631
RAC: 9,789
Message 62129 - Posted: 17 Feb 2020, 11:21:16 UTC - in response to Message 62126.  
Last modified: 17 Feb 2020, 11:21:59 UTC

My RAC has drastically decreased but should be rising soon enough after playing with the config for CPDN.


I am not sure where CPDN sits in the tables for credit for time spent crunching. I know it isn't at the top but I suspect there are probably projects below it as well.

This comparison https://boinc.netsoft-online.com/e107_plugins/boinc/get_cpcs.php would suggest that CPDN gives fewer credits per CPU second than just about any other project. It probably doesn't list all projects, and it includes projects that use GPUs as well. At least it is missing the comparison between LHC and CPDN, which are both CPU-only projects that I participate in.
ID: 62129
ed2353
Joined: 15 Feb 06
Posts: 132
Credit: 32,534,618
RAC: 13,033
Message 62149 - Posted: 24 Feb 2020, 16:57:51 UTC

Any indications (perhaps test batches?) of new Windows work in the foreseeable future?
My new Ryzen is getting hungry.
Currently it is chewing on two Linux tasks via VMPlayer and Linux Mint, but they seem to be slow going.
ID: 62149
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2653
Credit: 3,184,669
RAC: 1,147
Message 62150 - Posted: 24 Feb 2020, 18:15:17 UTC - in response to Message 62149.  

The only thing recently in testing was the OpenIFS-type tasks, which are the 64-bit Linux tasks, but even they do not, as far as I know, herald new work soon.

That said, I have said things like that before and then work has appeared. In the same way, there have been times when I said new work has been on the way and it has been a loooong time coming.
ID: 62150
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7163
Credit: 22,686,550
RAC: 10,671
Message 62151 - Posted: 24 Feb 2020, 20:45:30 UTC - in response to Message 62149.  

There are 4 researchers using Windows:
Pacific North West
Mexico (Central America and South America)
Korea
ANZ

All of these are probably still waiting for enough of the thousands of models they issued late last year to be returned, so that they can analyse the results.
ID: 62151
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2653
Credit: 3,184,669
RAC: 1,147
Message 62154 - Posted: 25 Feb 2020, 10:11:56 UTC - in response to Message 62151.  

There are 4 researchers using Windows:
Pacific North West
Mexico (Central America and South America)
Korea
ANZ

All of these are probably still waiting for enough of the thousands of models they issued late last year to be returned, so that they can analyse the results.


Those batches are all at between 73% and 79% success, with 20-26% in progress and 1% hard fails. I don't know whether the percentage needed to get good results varies from batch to batch, depending on how many more they put out compared with what is needed.
ID: 62154

©2020 climateprediction.net