Posts by old_user714979

1) Questions and Answers : Wish list : Using GPUs for number crunching (Message 48770)
Posted 11 Apr 2014 by old_user714979
Post:
Last work unit completed. I'm moving to another BOINC project and won't be following this thread. CYA..
2) Questions and Answers : Wish list : Using GPUs for number crunching (Message 48769)
Posted 11 Apr 2014 by old_user714979
Post:
I am sure you are right that this topic will keep being revisited, but until the ability to cope with 80-bit numbers appears on GPUs, my understanding is that it just isn't worth starting on. Once that capability is widely available it may well be worth working on, but only if someone comes up with enough money to fund the work.
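
For concreteness about what "80-bit" means here: the reference is to x87 extended precision, which carries a 64-bit mantissa versus the 53 bits of an IEEE double. A trivial host-side check (my own sketch; on x86 Linux with gcc, long double maps to the 80-bit x87 format, but this is platform-dependent):

// Sketch: mantissa widths of double vs x87 extended precision.
// LDBL_MANT_DIG is 64 on x86 Linux/gcc (the 80-bit format); GPUs
// offer no equivalent, which is the objection quoted above.
#include <cstdio>
#include <cfloat>

int main() {
    printf("double:      %d mantissa bits\n", DBL_MANT_DIG);   // 53
    printf("long double: %d mantissa bits\n", LDBL_MANT_DIG);  // 64 on x86
    return 0;
}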

Double precision started appearing in AMD GPUs in 2008, and they all stress IEEE 754 compliance in their marketing hype. Now double precision is in every GPU, and the discussion has moved on to the performance differences between Nvidia and AMD and the cost/power/performance trade-offs between different GPUs.

Programmers still have to know the hardware:
https://developer.nvidia.com/sites/default/files/akamai/cuda/files/NVIDIA-CUDA-Floating-Point.pdf
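
One concrete example of the kind of hardware detail that document covers: GPUs fuse multiply and add into a single rounding step (FMA), so the same expression can produce different last bits than a CPU doing a separate multiply and add. A minimal CUDA sketch of my own, with values chosen so the two roundings visibly differ; this is illustrative only, not CPDN or Met Office code:

// Sketch: fused multiply-add (one rounding) vs separate multiply
// then add (two roundings). With a = 1 + 2^-13, a*a - 1 keeps the
// tiny a^2 term under FMA but loses it when the product is rounded
// before the subtraction.
#include <cstdio>

__global__ void compare(float a, float *out) {
    out[0] = __fmaf_rn(a, a, -1.0f);     // single rounding
    out[1] = __fmul_rn(a, a) + (-1.0f);  // two roundings
}

int main() {
    float h[2], *d;
    cudaMalloc(&d, 2 * sizeof(float));
    compare<<<1, 1>>>(1.0001220703125f, d);  // a = 1 + 2^-13
    cudaMemcpy(h, d, 2 * sizeof(float), cudaMemcpyDeviceToHost);
    printf("fma: %.9g  mul+add: %.9g\n", h[0], h[1]);  // differ in the low bits
    cudaFree(d);
    return 0;
}
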
You have suggested taking donations specifically for this, but as these would almost certainly take away from the donations the project currently receives, I think it is unlikely. The only ways I can see it happening are if a group of volunteer programmers with the skills, the time and the hardware resources wishes to take it on, or if someone takes it on as a master's/PhD project.

I think there was a comment in this thread from a moderator that the Met Office doesn't release its code, which would rule out volunteer programmer teams. I cannot say whether the task is too large or too small for a single Piled Higher and Deeper student, or whether a code optimization for a different processor is a suitable thesis topic.
3) Questions and Answers : Wish list : Using GPUs for number crunching (Message 48751)
Posted 10 Apr 2014 by old_user714979
Post:
You STILL haven't got the point.

The Met Office models are a global standard.
They're run by researchers in a lot of places around the globe, who know that these models are stable, and that they can compare results with other people using them.

The wish list request here is not to change the model but to optimize code running on particular hardware. The Unified Model does not run on the same supercomputer hardware at all the different sites around the world, and the FORTRAN compilers already make low-level optimization choices appropriate to each machine. Ideally these optimizations do not affect the accuracy of the results, even when they change, for example, the execution order of instructions. I seem to remember that even in my CDC 6400/6600 SCOPE/KRONOS days, FORTRAN optimizations for particular hardware could exist without invalidating results.
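
To make the reordering point concrete: floating-point addition is not associative, so an optimizer that reassociates operations can change the bits of a result, which is why compilers only do this when explicitly allowed (fast-math style flags). A deliberately extreme sketch of my own:

// Sketch: the same three numbers summed in two orders. With floats,
// (a + b) + c = 1.0 but a + (b + c) = 0.0, because b + c rounds
// away the contribution of c before a is added.
#include <cstdio>

int main() {
    float a = 1.0e8f, b = -1.0e8f, c = 1.0f;
    printf("(a + b) + c = %g\n", (a + b) + c);  // 1
    printf("a + (b + c) = %g\n", a + (b + c));  // 0
    return 0;
}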

I'd expect the Met Office would see little practical benefit in resourcing a team to produce code optimized for execution on GPUs. In effect they control the purse strings and it is their cost/benefit considerations that determine priority. The Unified Model code is not static as improvements are incorporated over time but if the cost of developing or maintaining GPU code exceeds any benefits to them then it is pointless.

Every researcher currently using these models would have to switch to GPU programs and start again with their testing to see if they get consistent results.

A request to optimize code is not a request to change the model, so there is no requirement for researchers to switch programs. Of course, if the new code were faster than the existing code there might be an incentive to upgrade.

Why bother when the current system works? And if they don't change, then we don't.

The viewpoints are different. On one side there is access to a resource of wasted CPU instruction cycles that could be utilized for climate studies; on the other there are people willing to donate their wasted CPU and GPU cycles for a good purpose. As a person donating CPU and GPU cycles, I wish to donate all of those cycles, not just a fraction. Other BOINC projects will fully utilize spare CPU and GPU cycles, so I perceive a higher benefit in donating my cycles to one of those projects.

I'm guessing people will keep revisiting this wish list topic.
4) Questions and Answers : Wish list : Using GPUs for number crunching (Message 48750)
Posted 10 Apr 2014 by old_user714979
Post:
This issue just won't stay dead. Every time you think it has been safely staked through the heart, it rises from the grave like Dracula in one of those old Hammer movies.

I haven't seen it staked through the heart, or even given a flesh wound, so I expect it will keep coming back each year as a wish list item.

Floating-point double precision didn't exist in hardware on the early GPUs, and there must have been a time when there was no FORTRAN compiler support for GPU hardware. Postings saying that single precision is not adequate, or that no FORTRAN compilers support GPU hardware, don't inflict even a flesh wound.
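
The single-precision objection is easy to demonstrate, which is exactly why it kept being raised. A small sketch of my own showing accumulation drift (nothing to do with the actual model arithmetic):

// Sketch: adding 0.1 ten million times. The float total drifts far
// from the expected 1,000,000 because each addition is rounded to
// 24-bit precision; the double total stays close.
#include <cstdio>

int main() {
    float  f = 0.0f;
    double d = 0.0;
    for (int i = 0; i < 10000000; ++i) {
        f += 0.1f;
        d += 0.1;
    }
    printf("float:  %f\n", f);  // visibly wrong
    printf("double: %f\n", d);  // ~1000000
    return 0;
}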

There are real issues with GPU hardware rounding, FORTRAN compilers and their extensions, and so on. But from these wish list threads it appears that people give reasons not to investigate GPU processing, while nobody seems to have actually attempted BOINC GPU processing and failed because of a GPU hardware limitation.

My guess is that the expense of optimizing and then testing the code for parallel GPU processing is the killer: impossible with the available resources, and unlikely in the foreseeable future. The aim has to be the same results as the existing BOINC work units, not a new model. Throw a lot of money, probably a vast pile of money, and programmers at the problem and BOINC GPU processing is possible. That is a flesh wound that won't stop people revisiting this as a wish list item.
5) Questions and Answers : Wish list : Using GPUs for number crunching (Message 48748)
Posted 10 Apr 2014 by old_user714979
Post:
Hi Volvo

The CPDN models come from the UK Met Office where they consist of a version of the Unified Model. CPDN then adapts these models for its own use, for example deciding on the precise parameter values for particular experiments and compiling the models for the three platforms: Windows, Linux and Mac. Further CPDN adaptations can consist of time-slicing long models so that different computers take on different sections, and they all have to be stitched together.

But they all still consist of the Unified Model which the Met Office has adapted and developed continuously for years. The Met Office has a team of developers working on this, just as at the small number of other institutions that have developed models. I've seen a list of the names of one of these teams; it filled a computer screen. I also know that these organisations employ ace programmers.

GPUs and weather modeling are not a new combination. Here are some random samples from Google:
http://www.mmm.ucar.edu/wrf/WG2/michalakes_lspp.pdf
http://www.nvidia.com/content/PDF/sc_2010/theater/Govett_SC10.pdf
http://data1.gfdl.noaa.gov/multi-core/2011/presentations/Govett_Successes%20for%20Challenges%20Using%20GPUs%20for%20Weather%20and%20Climate%20Models.pdf
Being an ace FORTRAN programmer does not instantly grant any detailed knowledge about programming GPU hardware. If you said there was a dedicated team researching GPU solutions then I would concede they have the knowledge. Finding somebody who has played with GPUs at home does not translate into that person having any influence over future directions of a corporate programming project.

To my knowledge these organisations all run their models on CPUs, in some cases on supercomputers. For example, a supercomputer in Tokyo is used for this purpose. If using GPUs were possible for the type of calculations required for climate models I'm pretty sure that all these model programmers in several institutions would already have harnessed this possibility. They have every motivation to complete model runs as quickly as possible because similar models based on the UM are used for weather prediction, for which they also run ensembles, albeit much smaller than ours at CPDN.

Supercomputers can be built with GPUs, e.g. http://en.wikipedia.org/wiki/Titan_(supercomputer) or http://en.wikipedia.org/wiki/Tianhe-I, and used for climate modelling.

CPDN has two programmers who do not design the UM which runs on CPU.

I wouldn't expect these two programmers to design the model, but I expect they would be considered experts in implementing stable code in the BOINC environment. If they have researched GPU processing and say it can never be done due to design limitations of the hardware platform (e.g. rounding errors), then end of story. If the BOINC programming team hasn't made that evaluation, or extra programming staff would be required to implement GPU processing, then it is a financial problem, not a technical one. Maybe the only option is to ask for donations specifically to investigate GPU processing.
We are all aware that running research tasks on computers uses electricity and that we need to ensure that our computers run as efficiently as possible. One way we can reduce the carbon footprint is by ensuring that as few models crash as possible.

As a Problem Manager (ITIL) in a large telco, I've seen a few train wrecks caused by teams and their programmers. I must admit the spectacular crashes do get attention. I tend to take notice when one of the earlier posts here said:
The project's servers are already struggling to cope with the huge amounts of data being returned. Why do you want to increase this so drastically?
6) Questions and Answers : Wish list : Using GPUs for number crunching (Message 48745)
Posted 10 Apr 2014 by old_user714979
Post:
Well, AFAIK all currently active climate models use SSE2 optimizations, and my guess is this means they're using double precision. The FORTRAN compiler linked a few posts back is for CUDA, and Nvidia cards have abysmally poor double-precision speed, only 1/24 of single-precision performance unless you pay $$$$ for the professional cards, so even a top-end Nvidia GTX 780 Ti only manages 210 GFLOPS at most. A quad-core (8 threads with HT) CPU, on the other hand, is around 100 GFLOPS, meaning even in the best case the Nvidia GPU will only be 2x faster than the CPU. In reality even 50% performance on the GPU can be too optimistic, meaning your "slow" CPU is outperforming your "fast" GPU.

So, unless most of the calculations can use single precision, a CUDA version of CPDN is a waste of development time.

Instead of CUDA, an OpenCL compiler would be more interesting, since OpenCL also works with the much faster AMD GPUs. But even with this additional speed, it's still unlikely a climate model can run faster on a GPU than on a CPU.

I'm actually moving away from the Nvidia card and, once prices settle, will probably try for an R9 280X as a compromise between my wish list and what I can afford. Hopefully an E3-1275 v3 will be matched with my new Asus P9D-WS motherboard (which in theory supports 4x CrossFire, ECC, RAID, etc.). In the distant future, when the prices of these cards crash on eBay, I'll probably move to 2x or 3x CrossFire to stretch the life of the system, or sooner if I need a performance upgrade. This is practically trailing-edge hardware for a gamer, but as an older gamer I'm looking for reliability over performance and have more requirements than pure gaming. Although I'm not into coin mining.

The R9 280X runs double precision at 1/4 of its single-precision rate. The AMD trade-off is heat for performance. This is the point where the comparison of a ~90 W Xeon quad core running 8 work units against a GPU could begin: with a performance/power ratio. Will any GPU be more efficient than the CPU? As I've yet to buy the graphics card there is some flexibility, but 1 GB cards are not going to be acceptable.
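
As a back-of-envelope sketch of that performance/power comparison (all numbers are my assumptions for illustration: ~100 double-precision GFLOPS at ~90 W for the Xeon, ~1000 DP GFLOPS at ~250 W board power for an R9 280X, i.e. a quarter of roughly 4 TFLOPS single precision; peak figures, not sustained):

// Sketch: peak GFLOPS per watt under the assumed numbers above.
#include <cstdio>

int main() {
    double cpu_gflops = 100.0,  cpu_watts = 90.0;   // assumed Xeon figures
    double gpu_gflops = 1000.0, gpu_watts = 250.0;  // assumed R9 280X figures
    printf("CPU: %.2f GFLOPS/W\n", cpu_gflops / cpu_watts);  // ~1.1
    printf("GPU: %.2f GFLOPS/W\n", gpu_gflops / gpu_watts);  // ~4.0
    return 0;
}

On those peak numbers the GPU wins the efficiency comparison by roughly 4x, but sustained throughput on real model code could easily erase that margin, which is the whole question.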

I was aware of the performance problems of the Nvidia line, but it gave me some crude numbers to compare CPU and GPU performance, and there was a commercial FORTRAN compiler available. Some comments in these GPU-related threads seemed concerned that no FORTRAN compiler was available to support GPU hardware.
7) Questions and Answers : Wish list : Using GPUs for number crunching (Message 48737)
Posted 8 Apr 2014 by old_user714979
Post:
Some people leave their PCs on just to complete work units and this may have consequences.

The Wikipedia entry for our local brown-coal-burning power station, Hazelwood, says it was listed as the least carbon-efficient power station in the OECD in a 2005 report by WWF Australia. You can argue whether that ranking is valid or not, but the issue is that a work unit completed on brown-coal power is probably going to do more damage to the environment than a work unit completed on a green energy source. Using inefficient code means work units take longer to complete, which increases the quantity of carbon released into the atmosphere.

In real terms one PC is a drop in the ocean. It may become a perception issue when considering the carbon released by the PC base over the lifetime of a climate prediction project.
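
As a rough sketch of the arithmetic behind that point (all figures are my assumptions for illustration: ~200 W system draw, a 100-hour work unit, and emission factors ranging from brown coal at roughly 1.2 kg CO2/kWh down to a much lower renewable figure):

// Sketch: kg of CO2 attributable to one work unit under the
// assumed draw, runtime and emission factors above.
#include <cstdio>

int main() {
    double watts = 200.0, hours = 100.0;
    double kwh = watts * hours / 1000.0;        // 20 kWh per work unit
    double brown_coal = 1.2, renewables = 0.05; // assumed kg CO2 per kWh
    printf("brown coal: %.1f kg CO2\n", kwh * brown_coal);  // ~24 kg
    printf("renewables: %.1f kg CO2\n", kwh * renewables);  // ~1 kg
    return 0;
}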

Based on the current task info for SETI@home's BOINC project, my video card's NVIDIA GPU completes a work unit in about 17 minutes, while the Intel CPU takes just under 3 hours on a similar task, roughly a tenfold difference.

As a FORTRAN compiler with GPU extensions already exists, the obvious first step is to compile the existing FORTRAN code for the BOINC client without using any GPU extensions and see if anything breaks. Maybe the compiler will have a meltdown; maybe it will work flawlessly. There is no point getting excited about GPU processing if the tools are not up to the job.
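
The same "see if anything breaks" idea, sketched in CUDA C rather than the CUDA Fortran compiler the post refers to (my own illustration, not a CPDN procedure): run a trivial kernel with a known answer before trusting the toolchain with anything bigger.

// Sketch: toolchain smoke test. Every element must come back as
// exactly 3*1 + 2 = 5 or something in the stack is broken.
#include <cstdio>

__global__ void axpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1024;
    float hx[n], hy[n];
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    float *dx, *dy;
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));
    cudaMemcpy(dx, hx, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, n * sizeof(float), cudaMemcpyHostToDevice);

    axpy<<<(n + 255) / 256, 256>>>(n, 3.0f, dx, dy);
    cudaMemcpy(hy, dy, n * sizeof(float), cudaMemcpyDeviceToHost);

    int bad = 0;
    for (int i = 0; i < n; ++i) if (hy[i] != 5.0f) ++bad;
    printf("%s (%d mismatches)\n", bad ? "FAILED" : "OK", bad);
    cudaFree(dx); cudaFree(dy);
    return bad != 0;
}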

Assuming no major problems with the compiler, the next step is professional development of your programming staff, with training on GPUs. At that point you might be in a position to know whether GPU processing is a reasonable option. The cost is a compiler plus some days of professional development, and then a decision on whether GPU processing is an option. Floating-point rounding, guard digits and the like are significant issues that can't be assessed without knowledge of the GPUs and the software.

I'll do my bit to be more efficient by not getting more climateprediction.net tasks and I'll concentrate on efficient BOINC projects that will use GPUs.

©2024 climateprediction.net