Thread 'New work Discussion'

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63580 - Posted: 1 Mar 2021, 19:58:48 UTC

I am inclined to agree with Peter. If the BOINC scheduler simply assigned them in order to available cores, you would get them. It is because it is trying to juggle the miserable resource share (and whatever else it manages to think up) that it gets itself into a dead-end situation. They could do better by getting rid of it entirely.


If tasks went out purely in order of first in the hopper, first out to machines, it would be a couple of weeks before any went to Linux machines. As it is, a significant number are going to Linux as well as to Macs. What I don't have any idea about is how the algorithm that decides works at the servers, or how configurable it is by projects. As it is, I see there are tasks from batches quite a long way back, such as 842 and 861 among others, still to go out. However, I don't know if they have had extra poured into the hopper at any stage, like the current hadam3c tasks.

And any project configuration of the algorithm will be done on the basis of what the scientists want rather than the crunchers.
Please do not private message myself or other moderators for help. This limits the number of people who are able to help and deprives others who may benefit from the answer.
ID: 63580
Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63581 - Posted: 1 Mar 2021, 20:03:43 UTC - in response to Message 63580.  

If tasks went out purely in order of first in the hopper, first out to machines, it would be a couple of weeks before any went to Linux machines.
If they can't figure out the difference between Linux and Mac, they have no business with the more esoteric stuff.


And any project configuration of the algorithm will be done on the basis of what the scientists want rather than the crunchers.
Yes, but I decide what projects I am on.
ID: 63581
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63582 - Posted: 1 Mar 2021, 20:15:24 UTC - in response to Message 63581.  

If they can't figure out the difference between Linux and Mac, they have no business with the more esoteric stuff.

It is only the Linux machines that are affected. They are going out to Macs fine.
Yes, but I decide what projects I am on.

Or not: you decide whether you run CPDN tasks, but not which bit of the project you get.

Personally I would have preferred that they kept the ability to choose what task types we ran. My laptop, which has lost video both to its own screen and the Dsub (don't know about HDMI because I don't have a lead), is horrendously slow on HADAM4 tasks, especially the N216 resolution ones, and if it were still running it would have been better suited to the hadam3c tasks, but I can live with that. I don't run anything but CPDN unless it has run out of work. With the pattern of famine or glut and nothing between the two, we seem to be very much in the glut phase for a bit, at least as far as Linux goes, though that might be different if it were not for all the people who don't notice they are crashing everything they get!
ID: 63582
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,879,198
RAC: 4,930
Message 63583 - Posted: 1 Mar 2021, 21:01:07 UTC

... got two from batch #837 on my Mac: it looks like a couple of weeks per model on an old Mac mini.
ID: 63583
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 63584 - Posted: 1 Mar 2021, 21:06:25 UTC - in response to Message 63577.  

Everything's a Boinc scheduler problem. I see it doing something stupid on one of my 7 machines almost every day. There's meant to be a new version in the pipeline, but I bet they just introduce even more bugs. It took me ages to convince them that it can't tell the difference between time taken on 24 cores and time taken per core.
I don't see this as a problem except for those who would like to choose which tasks to run and that is a project issue not a scheduler one.
Did you reply to the right thing here? The bug I found was to do with multi-core tasks. My client was downloading 24 times the work I asked for, because it thought I could run 24 of 24-core tasks at once, as it was assuming one was loaded per core. Doesn't matter here as they're single core tasks.
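To make that over-fetch concrete, here is a minimal sketch of the kind of miscount described. It is an illustration only, not BOINC's actual work-fetch code: the function name, its parameters, the one-day buffer and the one-hour task length are all made up for the example.

```python
import math

def tasks_to_fetch(buffer_seconds, task_seconds, host_cores, cores_per_task,
                   assume_one_per_core=False):
    """Hypothetical estimate of how many tasks are needed to fill a work buffer."""
    if assume_one_per_core:
        # The miscount: pretend every core can hold its own task,
        # even when a single task already occupies several cores.
        concurrent = host_cores
    else:
        # Tasks that can really run side by side on this host.
        concurrent = max(1, host_cores // cores_per_task)
    # Number of back-to-back task runs needed to cover the buffer.
    rounds = math.ceil(buffer_seconds / task_seconds)
    return rounds * concurrent

# A 24-core host asking for one day of 24-core tasks that take an hour each.
correct = tasks_to_fetch(86400, 3600, host_cores=24, cores_per_task=24)
buggy = tasks_to_fetch(86400, 3600, host_cores=24, cores_per_task=24,
                       assume_one_per_core=True)
print(correct, buggy, buggy // correct)
```

With single-core tasks (cores_per_task=1) both branches give the same answer, which is why the miscount would not matter for the tasks discussed in this thread.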
ID: 63584
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 63585 - Posted: 2 Mar 2021, 1:26:39 UTC

Just got one on my AirBook.
Hardest part was remembering how to get BOINC started on the computer.
ID: 63585
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63586 - Posted: 2 Mar 2021, 7:42:00 UTC - in response to Message 63584.  

Did you reply to the right thing here? The bug I found was to do with multi-core tasks. My client was downloading 24 times the work I asked for, because it thought I could run 24 of 24-core tasks at once, as it was assuming one was loaded per core. Doesn't matter here as they're single core tasks.


The bit I was replying to was,
Everything's a Boinc scheduler problem.


When I looked this morning 785 of these were listed as running. I haven't tried to work out how many are on Mac and how many on Linux.
ID: 63586
KAMasud
Joined: 6 Oct 06
Posts: 204
Credit: 7,608,986
RAC: 0
Message 63587 - Posted: 2 Mar 2021, 8:16:47 UTC - in response to Message 63569.  

But I get the point that you could be doing more work with a mix because the hadcm3s don't hammer the cache memory like the others. Nor do they require as much RAM. I don't need to be running much else on top of 5 tasks to use up all of my 32GB. - Will be buying another 32 some time to make things a bit more reasonable.

------------------------------
I can run two and a half on my 16 GB of RAM. Half, because one hangs around "waiting for memory".
ID: 63587
Mr. P Hucker
Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 63589 - Posted: 2 Mar 2021, 15:09:23 UTC - in response to Message 63586.  

Did you reply to the right thing here? The bug I found was to do with multi-core tasks. My client was downloading 24 times the work I asked for, because it thought I could run 24 of 24-core tasks at once, as it was assuming one was loaded per core. Doesn't matter here as they're single core tasks.


The bit I was replying to was,
Everything's a Boinc scheduler problem.


When I looked this morning 785 of these were listed as running. I haven't tried to work out how many are on Mac and how many on Linux.
The original problem stated was the short tasks won't run if the long tasks are running. Something funny going on there.
ID: 63589
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63590 - Posted: 2 Mar 2021, 18:50:41 UTC - in response to Message 63589.  

The original problem stated was the short tasks won't run if the long tasks are running. Something funny going on there.


But they will. Several are running on Linux as well as the Mac ones. If it had been first in the hopper, first out, none would be running on Linux and they would all be going to Mac hosts. As it is they are going to both though I have no idea of the mechanism that dictates how many of these go to Linux compared with the Hadam4 tasks in the queue. Given that there are so many of the hadam4 tasks I would have made the hadcm3's Mac only to give the Mac users a longer batch of tasks.
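A rough sketch of the "first in the hopper, first out" case, purely as an illustration. How the real server decides is not known here; the hopper contents, the dictionary fields and the function name are all assumptions made for the example.

```python
def next_task_fifo(hopper, platform):
    """Return and remove the oldest task in the hopper that this platform can run."""
    for index, task in enumerate(hopper):
        if platform in task["platforms"]:
            return hopper.pop(index)
    return None  # nothing in the hopper suits this platform

# Hypothetical hopper: a long run of Linux-only hadam4 tasks queued ahead of
# hadcm3 tasks that both Mac and Linux hosts can run.
hopper = [{"app": "hadam4", "platforms": {"linux"}} for _ in range(5)]
hopper += [{"app": "hadcm3", "platforms": {"linux", "mac"}} for _ in range(3)]

# Under strict FIFO, Linux hosts keep drawing the older hadam4 tasks, while
# Mac hosts skip past them and take the hadcm3 tasks.
linux_got = [next_task_fifo(hopper, "linux")["app"] for _ in range(2)]
mac_got = [next_task_fifo(hopper, "mac")["app"] for _ in range(2)]
print(linux_got, mac_got)
```

In this toy model no hadcm3 task reaches a Linux host until the hadam4 tasks ahead of it have gone. Since Linux hosts are in fact getting hadcm3 tasks, the real scheduler must be doing something other than plain FIFO.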

But as a cruncher and volunteer moderator I can do no more than suggest things. Sometimes they get taken up, sometimes they don't.
ID: 63590
Bryn Mawr
Joined: 28 Jul 19
Posts: 150
Credit: 12,830,559
RAC: 228
Message 63591 - Posted: 2 Mar 2021, 20:09:50 UTC - in response to Message 63590.  

The original problem stated was the short tasks won't run if the long tasks are running. Something funny going on there.


But they will. Several are running on Linux as well as the Mac ones. If it had been first in the hopper, first out, none would be running on Linux and they would all be going to Mac hosts. As it is they are going to both though I have no idea of the mechanism that dictates how many of these go to Linux compared with the Hadam4 tasks in the queue. Given that there are so many of the hadam4 tasks I would have made the hadcm3's Mac only to give the Mac users a longer batch of tasks.

But as a cruncher and volunteer moderator I can do no more than suggest things. Sometimes they get taken up, sometimes they don't.


Yep, I have a HadCM3 running alongside 2 HadAM4 WUs as we speak.

Looks to be a 240 month model issuing 1 trickle per year at about 5 hour intervals.
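For a back-of-envelope run time from those numbers, a small sketch follows. It assumes the 5 hours per trickle holds for the whole run, which is specific to this host and whatever else it is crunching; the function name is made up for the example.

```python
def estimate_run(model_months, hours_per_trickle, months_per_trickle=12):
    """Return (number of trickles, total hours) for a model of the given length."""
    trickles = model_months // months_per_trickle
    total_hours = trickles * hours_per_trickle
    return trickles, total_hours

# 240 model months, one trickle per model year, roughly 5 hours between trickles.
trickles, hours = estimate_run(240, 5.0)
print(f"{trickles} trickles, about {hours:.0f} hours ({hours / 24:.1f} days)")
```

Other model lengths can be plugged in the same way, as long as the hours per trickle is measured on the machine in question.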
ID: 63591
Alan K
Joined: 22 Feb 06
Posts: 491
Credit: 31,216,350
RAC: 15,494
Message 63603 - Posted: 6 Mar 2021, 23:44:31 UTC - in response to Message 63575.  

I've picked up 2 of these. One is a 168 month model, the other is 240 months. Looks like one trickle per year. Running alongside some N216s.
ID: 63603
Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63604 - Posted: 7 Mar 2021, 3:42:36 UTC - in response to Message 63591.  

The original problem stated was the short tasks won't run if the long tasks are running. Something funny going on there.


But they will. Several are running on Linux as well as the Mac ones. If it had been first in the hopper, first out, none would be running on Linux and they would all be going to Mac hosts.

I don't think you see it. It is not the hopper in the server, but in your BOINC client.

I just started up a newly-built Ryzen 3600 on Ubuntu 20.04.2, and allowed four CPDN to download. They are all N216.
https://www.cpdn.org/results.php?hostid=1516019

I keep the default buffer of 0.1 + 0.5 days. I might be able to get a different mix with a larger buffer, but it would still be random and not what I want.
That limits the amount that I am willing to do.
ID: 63604
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63605 - Posted: 7 Mar 2021, 10:19:17 UTC - in response to Message 63604.  
Last modified: 7 Mar 2021, 10:23:46 UTC

I keep the default buffer of 0.1 + 0.5 days. I might be able to get a different mix with a larger buffer, but it would still be random and not what I want.
That limits the amount that I am willing to do.


Interesting how the random bit works, I have my cache settings at 0.1 and 0.1 days, just downloaded some more work and now have 8 hadcm3 tasks running alongside 4 hadam4h ones. I have suspended the remaining 4 hadam4h tasks because of the cache memory limitations which would slow everything down a lot if I ran them as well.

Pleased to see minimal increase in temperatures going up from four tasks to twelve. I have now set the project back to no new tasks and will wait till the four running hadam4 tasks have completed before requesting any more work. This way seems to be the closest I can get to controlling the tasks I get. Had I been able to choose task types, I would have used the project pages to stop any more hadam4h tasks being downloaded till the current ones were finished.
ID: 63605
mikey
Joined: 18 Nov 18
Posts: 21
Credit: 6,635,794
RAC: 2,524
Message 63606 - Posted: 7 Mar 2021, 13:10:20 UTC - in response to Message 63378.  

The UK doesn't have problems with slow mail or expensive prices, as we have 10 competing couriers.


We didn't have problems with slow mail or expensive prices before we had 10 competing couriers.


And THAT is one of the conspiracy theories about the slow downs in the USPS mail system, the guy running the thing has a HUGE investment in a 3rd party mail carrier and every piece of mail that's delayed his company gets a chunk of to deliver!!

I just moved and all of my mail is now being forwarded to a new address....instead of telling the main distribution place that my mail is being forwarded it instead still goes to my local post office who then says 'nope can't deliver this because it's being forwarded' and then sends it BACK to the distribution center for rerouting to the new address!! This delay means nothing is getting here on time and often 3rd party people are delivering it!! I have several places that STILL don't allow electronic payment of the bills so I had to do a pre-payment of several months just so everything keeps happening as I like it.
ID: 63606
mikey
Joined: 18 Nov 18
Posts: 21
Credit: 6,635,794
RAC: 2,524
Message 63607 - Posted: 7 Mar 2021, 13:13:24 UTC - in response to Message 63605.  

I keep the default buffer of 0.1 + 0.5 days. I might be able to get a different mix with a larger buffer, but it would still be random and not what I want.
That limits the amount that I am willing to do.


Interesting how the random bit works, I have my cache settings at 0.1 and 0.1 days, just downloaded some more work and now have 8 hadcm3 tasks running alongside 4 hadam4h ones. I have suspended the remaining 4 hadam4h tasks because of the cache memory limitations which would slow everything down a lot if I ran them as well.

pleased to see minimal increase in temperatures going up from four tasks to twelve. I have now set the project back to no new tasks and will wait till the four running hadam4 tasks have completed before requesting any more work. This way seems to be the closest I can get to controlling the tasks I get. Had I been able to choose task types, I would have used the project pages to stop any more hadam4h tasks being downloaded till the current ones were finished.


I would LOVE for the Project to make those choices available to us!!! That way my underpowered machines aren't getting tasks they struggle with and can wait patiently for the ones they can handle much more easily.
ID: 63607
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,879,198
RAC: 4,930
Message 63628 - Posted: 8 Mar 2021, 23:11:28 UTC - in response to Message 63603.  

[Alan K wrote:] I've picked up 2 of these. One is a 168 month model, the other is 240 months. Looks like one trickle per year. Running alongside some N216s.

There does seem to be a variety of lengths in that batch. To add to your list I've just got a 360 month version, which will take a long time to run on my ancient Mac.

(I had better check the upload limit as, if the Zips are the same size as the 240 month version, then 36 x 95 MB plus restarts is quite large ...)
ID: 63628
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 63629 - Posted: 9 Mar 2021, 8:18:55 UTC - in response to Message 63628.  
Last modified: 9 Mar 2021, 8:19:21 UTC

(I had better check the upload limit as, if the Zips are the same size as the 240 month version, then 36 x 95 MB plus restarts is quite large ...)

In another thread there is a cruncher talking about putting a Ryzen Threadripper with 64 threads to work. 64 x 36 x 95 MB = 218.88 GB! I hope that they are not, like myself, still on copper wires!
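The arithmetic can be checked with a short sketch. The 36 zips of 95 MB are the figures quoted above for a 360 month model; the 1 Mbit/s uplink is purely an assumed figure for a slow copper line, and restart uploads are ignored.

```python
def upload_gb(tasks, zips_per_task, mb_per_zip):
    """Total upload volume in decimal gigabytes (1 GB = 1000 MB)."""
    return tasks * zips_per_task * mb_per_zip / 1000

def upload_days(gigabytes, uplink_mbit_per_s):
    """Days of continuous uploading at the given uplink speed."""
    megabits = gigabytes * 8000  # 1 GB = 8000 Mbit
    return megabits / uplink_mbit_per_s / 86400

per_task = upload_gb(1, 36, 95)       # one 360 month model
all_threads = upload_gb(64, 36, 95)   # one such model on each of 64 threads
# Assumed 1 Mbit/s uplink, not a measured value.
days = upload_days(all_threads, 1.0)
print(f"{per_task:.2f} GB per task, {all_threads:.2f} GB in total, "
      f"about {days:.0f} days at 1 Mbit/s")
```

This is an upper bound of sorts, since it assumes every one of the 64 tasks is the 360 month variety.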
ID: 63629
mikey
Joined: 18 Nov 18
Posts: 21
Credit: 6,635,794
RAC: 2,524
Message 63630 - Posted: 9 Mar 2021, 13:30:50 UTC - in response to Message 63629.  

(I had better check the upload limit as, if the Zips are the same size as the 240 month version, then 36 x 95 MB plus restarts is quite large ...)

In another thread there is a cruncher talking about putting a Ryzen thread ripper with 64 threads to work. 64x36x95=218.88GB! I hope that they are not like myself still on copper wires!


LOL
ID: 63630
Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 63631 - Posted: 9 Mar 2021, 18:15:21 UTC - in response to Message 63577.  

Edit: #897 that has been poured into the hopper will throw a spanner in the works and slow down the rate they go out to Linux machines though I don't know enough to work out by how much.
I have been running three 897 and one 898 on four cores of a Ryzen 3600, with QuChemPedia on the other eight cores.
They have been running for 2 1/2 days with no trickles, and an estimated total time of 15 days.

This is longer than the previous batch (891), which ran for 9 1/2 days under comparable conditions.
So either they are bigger work units, or else they take up more cache and are slowing down the processing.

Do you know which?
ID: 63631

©2024 cpdn.org