Message boards : Number crunching : New work discussion - 2
Author | Message |
---|---|
Send message Joined: 6 Jul 06 Posts: 147 Credit: 3,615,496 RAC: 420 |
Thanks, that is what I thought, possibly the reason I could not get access before as well. Thanks Conan |
Send message Joined: 12 Apr 21 Posts: 318 Credit: 14,976,910 RAC: 9,985 |
"It's also quite oversubscribed, I rarely get dev tasks. There's also no credit and the risk of getting misconfigured workunits that can disrupt the client (e.g. wrong memory settings)." Sounds like the main site in most ways. Except on the main site there's a guarantee of getting a ton of misconfigured machines that disrupt the project by ruining tasks. Seems like it may not be a bad idea for CPDN to start doing some house cleaning, even if little by little. :-) |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
I see that OpenIFS 43r3 Baroclinic Lifecycle has appeared as a third OpenIFS task type. "Seems like it may not be a bad idea for CPDN to start doing some house cleaning, even if little by little. :-)" It used to happen on a regular basis. I suspect the main reason it stopped was that it was seen as a lot of extra work for Andy for the amount gained by the project. I'm pretty sure they wouldn't want to give moderators the power to suspend the guilty machines, as it would be difficult to do so without allowing access to so much more. If OpenIFS becomes the dominant model type, the problem should largely disappear, at least for the missing-libraries issue, which accounts for the majority of dodgy computers on the project. |
Send message Joined: 29 Oct 17 Posts: 1051 Credit: 16,636,385 RAC: 11,909 |
"I see that OpenIFS 43r3 Baroclinic Lifecycle has appeared as a third OpenIFS task type." Yes, I described this in an earlier message (https://www.cpdn.org/forum_thread.php?id=9149&postid=66191). I'm also working with a student at U. Oxford on another customized version of OpenIFS for seasonal forecasts with perturbations to the surface model. The plan is for several batches of ~3000 workunits each, though this will be a while yet as it's still being developed & tested. I'm not sure what the workunit count will be for the lifecycle model. |
Send message Joined: 31 Dec 07 Posts: 1152 Credit: 22,363,583 RAC: 5,022 |
"'I see that OpenIFS 43r3 Baroclinic Lifecycle has appeared as a third OpenIFS task type.' Yes, I described this in an earlier message (https://www.cpdn.org/forum_thread.php?id=9149&postid=66191). I'm also working with a student at U. Oxford on another customized version of OpenIFS for seasonal forecasts with perturbations to the surface model. The plan is for several batches of ~3000 workunits each, though this will be a while yet as it's still being developed & tested. I'm not sure what the workunit count will be for the lifecycle model." I don’t know all that much about computers. Will this OpenIFS stuff be open to Windows users, or is it just more penguin food? I am running Win10 and 11 on different machines. |
Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Linux at present, and for some time. ********************** Unix/Linux is a more natural language for programmers of large computers. Trying to hammer it into Windows and still have it work can be tricky sometimes. I followed Microsoft's advice years ago when they were trying to get rid of Windows XP: "Upgrade to a newer OS, and if necessary, newer hardware that will run it." So I upgraded the hardware to a new cpu type, and the OS to Linux. Best advice that I've ever had from them. |
Send message Joined: 12 Apr 21 Posts: 318 Credit: 14,976,910 RAC: 9,985 |
"... or is it just more penguin food." This is great, I'm going to have to start using it. "I followed Microsoft's advice years ago when they were trying to get rid of Windows XP:" This is pretty good too. :-) CPDN is definitely almost all penguin food. With a little work, though, you can get yourself a virtual penguin (or a few) to feed via WSL2, which is part of Windows, or VBox, which needs to be installed and set up separately. Apparently OpenIFS will eventually have a VBox version that'll require a much simpler VBox setup than is currently needed. |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
Talking of penguin food, currently running one of a batch of five OpenIFS tasks and a bunch of HADSM4's from testing so things are moving again. |
Send message Joined: 1 Jan 07 Posts: 1061 Credit: 36,748,059 RAC: 5,647 |
I went fishing on the dev site, and got just HADSM4's. Still, it proves I got the connection right. |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
"I went fishing on the dev site, and got just HADSM4's. Still, it proves I got the connection right." There were only five of the OpenIFS ones, so you needed to get in quick. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
I am running a Penguin (Computer ID 1511241), but the last work unit I got was: Task 22222161; Work unit 12146959; Sent 28 Jul 2022, 9:43:21 UTC; Reported 30 Jul 2022, 17:28:51 UTC; Status Completed; Run time 190,787.75 s; CPU time 188,926.80 s; Credit 9,616.92; Application UK Met Office HadSM4 at N144 resolution v8.02 i686-pc-linux-gnu |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
Not likely to be more till the current testing branch work completes. |
Send message Joined: 29 Oct 17 Posts: 1051 Credit: 16,636,385 RAC: 11,909 |
"Not likely to be more till the current testing branch work completes." Yes, and then 1000s are planned ;) |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
"Yes, and then 1000s are planned ;)" Even batches of 15K or more tasks go quite quickly if they are for Windows. Linux-only tasks, if these arrive before the VM ones for MS, will last a bit longer. Sadly, the first six of my HADSM4's have all gone down to -ve theta crashes. Five more are still running. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
I am inclined to think that they will go quickly too, though the limits are probably more on bandwidth than memory for most people I think. But once the word gets out, there will be a lot of people willing to try at least. |
Send message Joined: 29 Oct 17 Posts: 1051 Credit: 16,636,385 RAC: 11,909 |
"I am inclined to think that they will go quickly too, though the limits are probably more on bandwidth than memory for most people I think." According to the host page at https://www.cpdn.org/host_stats.php, there are 800 active Linux hosts at last count. Assuming that also includes Linux in VirtualBox and WSL, when upwards of ~5000 Linux OpenIFS tasks go out, that's plenty of work. Some of my work will require the higher-memory machines. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
"Assuming that also includes linux in virtualbox and WSL, when upwards of ~5000 linux openifs tasks go out, that's plenty of work. Some of my work will require the higher memory machines." Very good, I can put two Ryzen 3600's on it, one with 64 GB and the other 128 GB. But they may get into a fight over bandwidth across the Atlantic, which seems to be limited to 10 Mbps for me. I may have to back off to one machine. |
Send message Joined: 29 Oct 17 Posts: 1051 Credit: 16,636,385 RAC: 11,909 |
One of the nice things about OpenIFS (or IFS in general) is that the output format (GRIB) was originally designed to be transmitted over unreliable telephone lines. So it's a highly (lossy) compressed format. This means the size of the output files scales slowly with increasing amount of output, much less than the model's memory requirements scale with model resolution. "Assuming that also includes linux in virtualbox and WSL, when upwards of ~5000 linux openifs tasks go out, that's plenty of work. Some of my work will require the higher memory machines." "Very good, I can put two Ryzen 3600's on it, one with 64 GB and the other 128 GB. But they may get into a fight over bandwidth across the Atlantic, which seems to be limited to 10 Mbps for me. I may have to back off to one machine." We are often reminded by the CPDN team not to overdo the output, so don't worry, we are very aware of bandwidth restrictions. There is possibly an issue with OpenIFS in BOINC that I've noted and still need to look into. BOINC starts multiple OpenIFS tasks because there are free CPU slots, even though the total memory for the tasks exceeds what's available. When I asked Andy about this, he said the BOINC client will monitor memory and suspend the tasks if memory is exceeded. However, when OpenIFS starts it immediately allocates memory for itself (you can watch this happen on the process monitor), and the client doesn't seem to be quick enough to catch multiple OpenIFS tasks hitting 8 GB of RAM each. If you haven't got the RAM, the models will crash. As I say, I still need to do more testing to verify this is what's happening. If it is, we might need to put a health warning on running multiple OpenIFS instances if it's not possible to control this. |
Send message Joined: 22 Feb 06 Posts: 491 Credit: 31,246,235 RAC: 15,489 |
"BOINC starts multiple OpenIFS tasks because there are free CPU slots, even though the total memory for the tasks exceeds what's available." Can this be overcome by limiting the number of cores available to BOINC before downloading any of the IFS models? Although I have a four-core CPU, the box only has 24 GB of RAM. |
Send message Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653 |
"BOINC starts multiple OpenIFS tasks because there are free CPU slots, even though the total memory for the tasks exceeds what's available. " As I understand it, they are going to run two cores per work unit at first, so you will have only two work units running. The memory should be enough. (But yes, if you limit the number of BOINC CPU cores, then that will limit the number of work units running.) |
©2024 cpdn.org
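
The core and memory limits discussed in the last few posts come down to simple arithmetic, which can be sketched as below. This is only an illustration, not anything BOINC or CPDN provides: the function name `max_concurrent_tasks` is hypothetical, the 8 GB per task figure is the one quoted in the thread, the two cores per task is the expectation stated in the thread rather than a confirmed setting, and memory used by the operating system and other programs is ignored.

```python
def max_concurrent_tasks(cpu_cores, ram_gb, cores_per_task=2, ram_per_task_gb=8.0):
    """Estimate how many tasks can run at once without oversubscribing
    either the CPU cores or the RAM (hypothetical helper, not a BOINC API).

    Defaults are assumptions taken from the thread: 2 cores per task and
    roughly 8 GB of RAM per task.
    """
    if cores_per_task <= 0 or ram_per_task_gb <= 0:
        raise ValueError("per-task requirements must be positive")
    by_cpu = cpu_cores // cores_per_task        # tasks the cores allow
    by_ram = int(ram_gb // ram_per_task_gb)     # tasks the memory allows
    return min(by_cpu, by_ram)                  # the tighter limit wins


if __name__ == "__main__":
    # The four-core, 24 GB box mentioned in the thread.
    print(max_concurrent_tasks(cpu_cores=4, ram_gb=24))
    # The same box if each task used only one core: memory becomes the limit.
    print(max_concurrent_tasks(cpu_cores=4, ram_gb=24, cores_per_task=1))
```

Under these assumptions the four-core, 24 GB machine is limited to two tasks by its cores, which fits the remark that the memory should be enough. If each task used a single core, four tasks could start but only three would fit in memory, which is the situation where restricting the number of cores available to BOINC would help.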