Message boards : Number crunching : New work Discussion
Joined: 15 Jan 06 Posts: 637 Credit: 26,751,529 RAC: 653
My usual problem is dropped connections on short files - they can't handle the concurrency. I changed these in my cc_config.xml:
<max_file_xfers>1</max_file_xfers>
<max_file_xfers_per_project>1</max_file_xfers_per_project>
They are normally set to 8 and 4 respectively. It fixed my last 7 stuck downloads. Thanks for the tip.
EDIT: "2" seems to work OK also.
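For anyone wanting to try the same change, a minimal cc_config.xml might look like the sketch below. The two option names are quoted from the post above; the surrounding <cc_config>/<options> wrapper is the standard BOINC client config structure, and the values simply mirror the post. The file lives in the BOINC data directory and is picked up after telling the client to re-read its config files or restarting it.

<cc_config>
  <options>
    <!-- total simultaneous file transfers across all projects -->
    <max_file_xfers>1</max_file_xfers>
    <!-- simultaneous file transfers per project -->
    <max_file_xfers_per_project>1</max_file_xfers_per_project>
  </options>
</cc_config>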
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918
> My usual problem is dropped connections on short files - they can't handle the concurrency.
Good idea, I'll reduce mine to 8 and 1. There's no point in me trying to get 8 at once when most projects can give me files almost as fast as my fibre, and those that can't are going to get overloaded by being asked for several at once. But I'll leave the first figure at 8 in case one project is slow, so others can still get through.
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0
OpenIFS 43r3 ARM
There are ITX boards available on which you can mount multiple Raspberry Pi 4B compute modules, and some boards also have a GPU slot. Each Raspberry Pi 4B is available in an 8GB RAM flavour. These Pi 4B compute modules are cheap and powerful, with four cores. Windows for ARM is a no-go, so Linux it will be. My question is: can we run OpenIFS 43r3 ARM on a cluster of these? https://hackaday.com/2021/11/28/this-raspberry-pi-mini-itx-board-has-tons-of-io/
Joined: 29 Oct 17 Posts: 1036 Credit: 16,134,123 RAC: 12,670
> OpenIFS 43r3 ARM
OpenIFS has already been run on Pis. See: https://www.ecmwf.int/en/about/media-centre/science-blog/2019/weather-forecasts-openifs-home-made-supercomputer . I helped Sam set this up; it was a great demonstrator that he took to science fairs. ECMWF gave him a job after he finished at uni. I appreciate you're talking about something different, but it demonstrates that Pis will run the model.
How do the multiple Pis present themselves? If you had 2 Pis, would the system see 8 cores with a total of 16GB RAM shared between them? Or would it see 2 separate compute nodes with only 8GB addressable by each Pi? I ask because (a) OpenIFS needs a total of 16GB minimum to do anything useful; (b) although OpenIFS supports both MPI & OpenMP parallelism, I stripped out MPI to reduce the memory footprint. In the article above, we used MPI to communicate across Ethernet between the Pis, as OpenMP needs shared memory. But CPDN only use the shared-memory option in OpenIFS. So, yes, I'm sure the system would run OpenIFS (maybe with a bit of hacking), but not for any useful work in CPDN.
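To illustrate the distinction being drawn here, below is a generic hybrid MPI+OpenMP sketch in C (not OpenIFS code, just an assumed minimal example): MPI ranks are separate processes that could sit on different Pis and exchange messages over Ethernet, whereas OpenMP threads must share the memory of a single board. Something like mpicc -fopenmp would build it.

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    /* One MPI rank per Pi: distributed memory, communication over the network. */
    MPI_Init(&argc, &argv);
    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* OpenMP threads inside one rank: shared memory, so a single board only. */
    #pragma omp parallel
    {
        printf("rank %d of %d, thread %d of %d\n",
               rank, nranks, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}

An OpenMP-only build, as CPDN distributes, corresponds to running with a single rank: every thread must see the whole model in one address space, which is why the 8GB on an individual Pi becomes the limiting factor.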
Joined: 7 Sep 16 Posts: 262 Credit: 34,765,484 RAC: 11,446
> How do the multiple Pis present themselves? If you had 2 Pis, would the system see 8 cores with a total of 16GB RAM shared between them? Or would it see 2 separate compute nodes with only 8GB addressable by each Pi?
They would appear as two entirely separate Pi 4s, each with 4C/8GB.
Joined: 6 Oct 06 Posts: 204 Credit: 7,608,986 RAC: 0
There would be teething troubles, but there is an OpenIFS for ARM if you look at the applications page of CPDN. Which ARM are they talking about? The only ARMs I know of are single-board computers.
Joined: 15 May 09 Posts: 4523 Credit: 18,535,580 RAC: 7,828
With the ITX boards that can take multiple Pi 4B boards, my guess is it would work a bit like a render farm. I assume it would need one core from one of the Pis to manage how the work is spread around the rest. In the short term, however, I don't see the advantage: for the same price as one of the boards you can mount several 4Bs on, you can currently get more power from a Ryzen or Intel solution.
Joined: 29 Oct 17 Posts: 1036 Credit: 16,134,123 RAC: 12,670
> There would be teething troubles, but there is an OpenIFS for ARM if you look at the applications page of CPDN. Which ARM are they talking about? The only ARMs I know of are single-board computers.
The hardware isn't a problem. OpenIFS has been run on multiple ARM platforms - for example, the UK Isambard system (see: https://www.archer.ac.uk/training/virtual/2018-12-12-Isambard/archer-webinar-dec-2018.pdf). The compiler can sometimes be more of an issue if it doesn't support some of the modern Fortran features the model uses, but usually it's just a case of tuning the model code to work efficiently with processor cache sizes etc. I'm not sure how much I'm allowed to say about the ARM reference on the CPDN Applications page but, again, there was no issue applying the model to this hardware. In order to use the ITX board with multiple Raspberry Pis we'd need the full OpenIFS model code with MPI+OpenMP rather than the OpenMP-only version that CPDN use; it would work. But, as I said, the available memory on the Pis would limit the model to nothing more than a demonstrator, not sufficient for the work that CPDN needs. It's cheap hardware, great for certain applications but not for running weather models.
Joined: 6 Aug 04 Posts: 195 Credit: 28,139,303 RAC: 10,239
> Off topic ... spent some time today resetting config.xml and manually getting WCG to download Africa rainfall tasks
> That's for another forum. I've posted about it at WCG.
If that's the necessary workload, I'll give it a miss.
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918
> Off topic ... spent some time today resetting config.xml and manually getting WCG to download Africa rainfall tasks
> That's for another forum. I've posted about it at WCG.
> If that's the necessary workload, I'll give it a miss.
I've got 90 CPU cores, 12 GPUs and 3 Android phones running WCG flat out. I find it fun pestering the server. It makes BOINC more involved. The blast of heat as I walk past a garage window is absurd. I'm either going to solve every world problem, or use up all the electricity :-)
Joined: 5 Aug 04 Posts: 1118 Credit: 17,177,237 RAC: 2,478
> Would be better if they were zipped into a smaller number of larger files surely.
I think many of the files are the same between different issuances of work units, whereas other files may just be the initial conditions that vary more often. So sending them as a zip file would require zipping every work unit, including the initial conditions, before sending. That may be too much trouble (and load) on the server. I would be more interested in getting work units for ClimatePrediction.
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918
> I think many of the files are the same between different issuances of work units, whereas other files may just be the initial conditions that vary more often. So sending them as a zip file would require zipping every work unit, including the initial conditions, before sending. That may be too much trouble (and load) on the server.
I'm seeing huge numbers of tiny files for GPU Covid. I've done thousands of those tasks now, and they still send loads of files. So either they are different, or the server isn't acknowledging that I already have that file. With most projects, if I come back after a while, or every so often, there's a big dataset that gets downloaded just once, then the task files are smaller.
Joined: 1 Jan 07 Posts: 1058 Credit: 36,475,323 RAC: 12,710
> I'm seeing huge numbers of tiny files for GPU Covid. I've done thousands of those tasks now, and they still send loads of files. So either they are different, or the server isn't acknowledging that I already have that file. With most projects, if I come back after a while, or every so often, there's a big dataset that gets downloaded just once, then the task files are smaller.
Only one project: Einstein@home. It's called "locality scheduling", and their server was specially enhanced by their boss, Bruce Allen. Everybody else does it their own way. Technical detail: that only applies to the Gravity Wave search, using data from the LIGO detectors. Those are large, exquisitely detailed datasets, recorded by the 4 km laser interferometers. Thousands of individual workunits are created to scan them every which way possible. You don't do that with nano-scale protein molecules. They are, indeed, all different.
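As a rough illustration of the idea behind locality scheduling (an assumed toy sketch in C, not Einstein@home's actual server code): the scheduler prefers hosts that already hold the large data file a workunit scans, so the dataset is downloaded once and reused across many tasks - which is exactly what cannot work when every task's input is a different protein file.

#include <stdio.h>
#include <string.h>

struct host { const char *name; const char *cached_file; };
struct workunit { const char *name; const char *data_file; };

/* Return the index of a host that already has the workunit's data file, or -1. */
int pick_host(const struct host *hosts, int n, const struct workunit *wu) {
    for (int i = 0; i < n; i++)
        if (strcmp(hosts[i].cached_file, wu->data_file) == 0)
            return i;                /* reuse the cached dataset: no new download */
    return -1;                       /* no match: any host will do, full download needed */
}

int main(void) {
    /* Hypothetical host and file names, purely for illustration. */
    struct host hosts[] = { {"host-a", "segment_001.dat"}, {"host-b", "segment_002.dat"} };
    struct workunit wu = { "wu_42", "segment_002.dat" };
    int i = pick_host(hosts, 2, &wu);
    printf("send %s to %s\n", wu.name, i >= 0 ? hosts[i].name : "any host (new download)");
    return 0;
}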
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918
> Only one project: Einstein@home. It's called "locality scheduling", and their server was specially enhanced by their boss, Bruce Allen. Everybody else does it their own way.
I'm sure I've seen it elsewhere, LHC for example. But yes, I guess with virus research those files are always different. Zipping might help, but is probably no easier than fixing the duff network switch or whatever it is the 14 billion dollar company can't afford to replace! I'm assuming not many people are able to get so many GPU workunits, since I've managed to get from 10,290th to 8,766th place in 1 day. About 12 Tahiti-grade AMD GPUs running it 24/7, 4 Tflops rating each.
Joined: 29 Oct 17 Posts: 1036 Credit: 16,134,123 RAC: 12,670
> Would be better if they were zipped into a smaller number of larger files surely.
> I think many of the files are the same between different issuances of work units, whereas other files may just be the initial conditions that vary more often. So sending them as a zip file would require zipping every work unit, including the initial conditions, before sending. That may be too much trouble (and load) on the server.
It helps the server, not hinders it. CPDN works by zipping the files before loading them onto the server - it saves storage on the server, reduces the number of connections from clients, and reduces download time because the total download size is smaller due to compression. The 'zipping' is done by the scientist. That includes invariant files that are always needed for every experiment, plus files that vary per experiment, such as initial conditions. It's a no-brainer really. The client unzips before starting the task.
> I would be more interested in getting work units for ClimatePrediction.
They are coming. I'm in Oxford next week to chat to the team about setting up tests for the higher-resolution multicore jobs. But since I'm retired and not getting paid, I'll go at my own pace, though I'm keen to demonstrate the capability that might get (younger) scientists interested in using the platform. After that I plan to work on implementing OpenIFS in VMs for the Windows & Mac platforms, plus a few other things.
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918
> After that I plan to work on implementing OpenIFS in VMs for the Windows & Mac platforms, plus a few other things.
This pleases me. I take it Windows will then run a VirtualBox job much like LHC?
Joined: 15 May 09 Posts: 4523 Credit: 18,535,580 RAC: 7,828
> This pleases me. I take it Windows will then run a VirtualBox job much like LHC?
That is the plan.
Joined: 9 Oct 20 Posts: 690 Credit: 4,391,754 RAC: 6,918
> This pleases me. I take it Windows will then run a VirtualBox job much like LHC?
> That is the plan.
And with the much higher resolution, there will be plenty of work for all :-) Glenn really should get paid for this. When I worked in a university, we found grant money to pay wages to hire folk like that.
Joined: 15 May 09 Posts: 4523 Credit: 18,535,580 RAC: 7,828
More of the HADCM3S in testing. Still no clue as to how long before these mean more work for Macs, or of the timescale for any other new work. :(
Joined: 5 Aug 04 Posts: 1118 Credit: 17,177,237 RAC: 2,478
I got no ClimatePrediction tasks for my Linux machine in August even though it was up the whole time.
top - 09:38:26 up 30 days, 6 min, 1 user, load average: 8.09, 8.18, 8.21
Tasks: 456 total, 9 running, 447 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.6 us, 5.6 sy, 44.2 ni, 49.6 id, 0.0 wa, 0.1 hi, 0.0 si, 0.0 st
MiB Mem : 63772.8 total, 553.5 free, 5004.2 used, 58215.0 buff/cache
MiB Swap: 15992.0 total, 15240.0 free, 752.0 used. 57826.2 avail Mem