Message boards : Number crunching : processors, memory, performance and heat.
Author | Message |
---|---|
Send message Joined: 28 Feb 05 Posts: 20 Credit: 11,483,891 RAC: 17,119 |
"'Use at most x% of CPU time' should be removed from the BOINC source code." Unless it's causing WU crashes, it really helps with controlling heat issues on some rigs while still using all cores. |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
"Unless it's causing WU crashes, it really helps with controlling heat issues on some rigs while still using all cores." I get that. I just think reducing the number of cores in use is a better option. |
Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
"I just think reducing the number of cores in use is a better option." That also helps with improving cache-per-task, which can make a big difference in per-task performance. Though on most systems, if you're having thermal problems, the right answer is to tweak the power limit settings in the BIOS or with some mainboard utilities. You can clamp that down and not worry about what's loading up the cores, and you usually get a pretty nice boost in compute-per-watt. |
Send message Joined: 28 Jul 19 Posts: 150 Credit: 12,830,559 RAC: 228 |
"Unless it's causing WU crashes, it really helps with controlling heat issues on some rigs while still using all cores." That comes at the expense of thermally stressing the CPU, which does not happen if you use fewer cores for 100% of the time. |
Send message Joined: 29 Oct 17 Posts: 1051 Credit: 16,656,265 RAC: 10,640 |
"At the expense of thermally stressing the CPU, which does not happen if you use fewer cores for 100% of the time." That will have a bigger impact on throughput (tasks completed per day). A 10% drop in CPU use on all cores is all I need on my older machines to get temps I'm happy with. Taking 1 core away from 4 is the same as a 25% CPU use reduction. I prefer having the finer control of %CPU available. |
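The comparison in the post above can be sketched as a quick calculation. This is only an illustration under a loud assumption: throughput is taken to scale linearly with the CPU time made available, which ignores the cache-per-task and power-limit effects mentioned elsewhere in this thread. The function name `relative_capacity` is made up for the example.

```python
def relative_capacity(cores_used, total_cores, cpu_time_fraction):
    """Fraction of the machine's full compute capacity left for BOINC.

    Assumption (not from the thread): capacity scales linearly with
    both the number of cores in use and the share of CPU time allowed.
    """
    return (cores_used / total_cores) * cpu_time_fraction


# Option A: keep all 4 cores, set "use at most 90% of CPU time".
all_cores_throttled = relative_capacity(4, 4, 0.90)

# Option B: drop to 3 of 4 cores, each running 100% of the time.
one_core_removed = relative_capacity(3, 4, 1.00)

print(f"4 cores at 90% CPU time: {all_cores_throttled:.0%} of full capacity")
print(f"3 cores at 100% CPU time: {one_core_removed:.0%} of full capacity")
```

On a 4-core machine the core count can only be changed in 25% steps, which is the "finer control" argument for the percentage setting.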
Send message Joined: 22 Feb 11 Posts: 32 Credit: 226,546 RAC: 4,080 |
What about dropping CPU frequency by 10% with Ryzen Master, AMD OverDrive for older CPUs, or Intel Extreme Tuning Utility? |
Send message Joined: 28 Jul 19 Posts: 150 Credit: 12,830,559 RAC: 228 |
"At the expense of thermally stressing the CPU, which does not happen if you use fewer cores for 100% of the time." "That will have a bigger impact on throughput (tasks completed per day). A 10% drop in CPU use on all cores is all I need on my older machines to get temps I'm happy with. Taking 1 core away from 4 is the same as a 25% CPU use reduction. I prefer having the finer control of %CPU available." If you’re happy with the reduced life of the CPU then fine, just be aware that you are making that choice. |
Send message Joined: 29 Oct 17 Posts: 1051 Credit: 16,656,265 RAC: 10,640 |
These are old Intel chips that are already 10 years old. I'm not worried. Intel's newer CPUs are designed to run hotter than AMD's too. |
Send message Joined: 16 Aug 04 Posts: 156 Credit: 9,035,872 RAC: 2,928 |
Yeah, a new thread on core speed, s/TS, versus watts used on different chips would be fun to see. I don't think my little undervolted 6-core AMD 5600G with only 16 MB of L3 cache is so bad against the bigger ones. With 5 cores using 130W it takes around 7 days for wah2 8.29, or 4.37 kWh/task. Running 2, 3 or 4 cores is faster, but not by much. |
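The 4.37 kWh/task figure above can be reproduced with a short calculation. This sketch assumes the 130 W is the draw of the whole box while 5 tasks run side by side and that each task takes the full 7 days; the function name `kwh_per_task` is invented for the example.

```python
def kwh_per_task(power_watts, days_per_task, concurrent_tasks):
    """Energy attributed to one task, in kWh.

    Assumes constant power draw for the whole run and that the energy
    is shared equally among the tasks running at the same time.
    """
    hours = days_per_task * 24
    total_kwh = power_watts * hours / 1000  # watt-hours -> kWh
    return total_kwh / concurrent_tasks


# Figures quoted in the post: 130 W, about 7 days, 5 cores in use.
energy = kwh_per_task(power_watts=130, days_per_task=7, concurrent_tasks=5)
print(f"{energy:.2f} kWh/task")
```

Plugging in a different core count and run time gives a like-for-like number for comparing chips, as long as the wattage is measured the same way on each machine.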
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
"With 5 cores using 130W it takes around 7 days for wah2 8.29 or 4.37 kWh/task" My main (Linux) machine is consuming 275 watts and running 13 BOINC processes (none of them ClimatePrediction). The 275 watts includes the computer, the router, and the monitor.
ID: 1511241
Number of processors: 16
Memory: 125.07 GB
Cache: 16896 KB
Swap space: 15.62 GB
Total disk space: 488.04 GB
Free disk space: 480.47 GB
Measured floating point speed: 5.92 billion ops/sec
Measured integer speed: 23.22 billion ops/sec
Average upload rate: 194.32 KB/sec
Average download rate: 15613.09 KB/sec
Average turnaround time: 7.96 days
Every 11.0s: sensors    localhost.localdomain: Sat Mar 2 13:33:14 2024
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +75.0°C (high = +88.0°C, crit = +98.0°C)
Core 8: +68.0°C (high = +88.0°C, crit = +98.0°C)
Core 2: +66.0°C (high = +88.0°C, crit = +98.0°C)
Core 3: +71.0°C (high = +88.0°C, crit = +98.0°C)
Core 5: +70.0°C (high = +88.0°C, crit = +98.0°C)
Core 1: +75.0°C (high = +88.0°C, crit = +98.0°C)
Core 9: +74.0°C (high = +88.0°C, crit = +98.0°C)
Core 11: +67.0°C (high = +88.0°C, crit = +98.0°C)
Core 12: +65.0°C (high = +88.0°C, crit = +98.0°C) |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
Please can new posts on this be put in here rather than in the New work thread. Thank you. |
Send message Joined: 29 Oct 17 Posts: 1051 Credit: 16,656,265 RAC: 10,640 |
Some performance numbers for the WaH batch 1006 in the database. For the top 10 fastest run tasks we have:
-- Next 2: CPU time of 1.47 days. As the user has the computer hidden I won't post the details.
-- Next 2: CPU time of 1.67 days on a 12th gen Intel i9-12900K. United States.
-- Next 5: CPU times ranging from 1.9-2.1 days on a 12th gen Intel i7-12700H. Canada.
We also record the 'completion time', the time from when the host computer received the task to when the result came back to CPDN. The fastest completion time was 5.2 days. Median completion time is currently 15 days. Median CPU time is ~12 days.
--- CPDN Visiting Scientist |
Send message Joined: 12 Apr 21 Posts: 318 Credit: 15,000,104 RAC: 9,568 |
Looking around, seems like i9-13900K is near the top of single-core performance. Multi-core too (outside of Threadrippers) but this doesn't make a difference for CPDN. Seems like a good chip. |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
Might have to get a new box sometime. My Ryzen 7 3700X was fast when I got it. |
Send message Joined: 29 Oct 17 Posts: 1051 Credit: 16,656,265 RAC: 10,640 |
You're on the right forum if you want some encouragement! |
Send message Joined: 7 Sep 16 Posts: 262 Credit: 34,915,412 RAC: 16,463 |
I've got a pair of 3900Xs that do most of my computation (12C/24T), and I've found that I see almost no "net system throughput" improvements between 8 and 12 threads running with CPDN tasks - it may be marginally faster at 12, but not by much (mine are typically retiring 50-60G instructions per second when loaded). Going up past 12 actually reduces net system throughput. I think turbo might increase that slightly, but I generally keep it disabled to avoid the corner of "tons of extra power for a slight bit extra performance." There doesn't seem to be any benefit to hyperthreading with CPDN tasks (making sense, they're floating point/vector engine heavy), and they seem to prefer "enough cache" - though I think there's still a ton spilling to main DRAM, based on counters on my Intel boxes. I don't care a bit about single threaded speed for CPDN, just total system throughput. But Dave clearly needs some test chips! ;) |
Send message Joined: 15 May 09 Posts: 4541 Credit: 19,039,635 RAC: 18,944 |
I do wonder if faster RAM might help. Potentially I might need more than 32GB for some testing with OIFS even if on main site they are rationed to avoid problems with machines that don't have enough for multiple tasks. |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
"I do wonder if faster RAM might help. Potentially I might need more than 32GB for some testing with OIFS even if on main site they are rationed to avoid problems with machines that don't have enough for multiple tasks." My machine has this memory at the moment.
CPU type: Genuine Intel - Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz [Family 6 Model 85 Stepping 7]
Number of processors: 16
Operating System: Red Hat Enterprise Linux 8.9 (Ootpa) [4.18.0-513.18.1.el8_9.x86_64|libc 2.28]
BOINC version: 7.20.2
Memory: 125.07 GB [2933MHz DDR4]
Cache: 16896 KB
It came with 32 GBytes but I doubled it a couple of times as prices for RAM came down. I guess it is no longer state-of-the-art (if it ever was); it is several years old now, so there must surely be faster machines out there. I cannot put faster RAM in there, but I could run it up to 512 GBytes if someone would send me the money to do it. I doubt there is much point to doing that, since my L3 cache is 16384 Kbytes, which is pretty good for that kind of processor chip. I got all that RAM to run all those OIFS tasks that I have not received since last June, IIRC. |
Send message Joined: 29 Oct 17 Posts: 1051 Credit: 16,656,265 RAC: 10,640 |
"I do wonder if faster RAM might help. Potentially I might need more than 32GB for some testing with OIFS even if on main site they are rationed to avoid problems with machines that don't have enough for multiple tasks." I doubt it; CPU speed is what makes the biggest difference. I like the Intel chips currently: more cores than Ryzen for the same or less money, good single-core performance, better memory latency than Ryzen. The mid-range i5 (or i7) is good value for money. Chip cache won't make much difference because the code is not optimized for specific cache sizes. Plus DDR5. --- CPDN Visiting Scientist |
Send message Joined: 5 Aug 04 Posts: 1120 Credit: 17,202,915 RAC: 2,154 |
Do Linux users know about this interesting tool?
# perf stat -e cache-references,cache-misses,cycles,instructions,branches,faults
^C
Performance counter stats for 'system wide':
4,751,265,017 cache-references
1,957,008,106 cache-misses # 41.189 % of all cache refs
1,416,865,456,289 cycles
1,984,715,137,591 instructions # 1.40 insn per cycle
273,726,331,297 branches
50,751 faults
25.357650625 seconds time elapsed
You start the perf program with the first line. When you think it has run long enough, you hit Control-C. It then prints the results. The machine was doing this, i.e., mostly BOINC work (13 BOINC tasks):
top - 17:56:27 up 11 days, 4:21, 2 users, load average: 13.58, 13.52, 13.51
Tasks: 483 total, 14 running, 469 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.4 us, 0.1 sy, 80.6 ni, 18.6 id, 0.0 wa, 0.2 hi, 0.0 si, 0.0 st
MiB Mem : 128074.1 total, 2100.0 free, 6385.8 used, 119588.3 buff/cache
MiB Swap: 15992.0 total, 15947.2 free, 44.8 used. 118485.6 avail Mem
My actual results are probably of no interest to readers here because none of the BOINC tasks were CPDN tasks. But if I ever get more, I will be able to see how they do. With that workload on my machine, a little over half of the cache references were satisfied by the cache. |
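For anyone who wants to compare their own runs, the two derived figures that perf annotates (cache miss percentage and instructions per cycle) can be recomputed from the raw counters. The snippet below is a minimal sketch that simply pastes in the counter values from the post above; the helper `parse_count` and the variable names are made up for illustration, and nothing here calls perf itself.

```python
def parse_count(text):
    """Turn a perf-style count such as '4,751,265,017' into an int."""
    return int(text.replace(",", ""))


# Raw counters copied from the perf stat output in the post above.
cache_references = parse_count("4,751,265,017")
cache_misses = parse_count("1,957,008,106")
cycles = parse_count("1,416,865,456,289")
instructions = parse_count("1,984,715,137,591")

miss_rate = cache_misses / cache_references  # share of cache refs that missed
hit_rate = 1.0 - miss_rate                   # share satisfied by the cache
ipc = instructions / cycles                  # instructions per cycle

print(f"cache miss rate: {miss_rate:.3%}")
print(f"cache hit rate:  {hit_rate:.3%}")
print(f"insn per cycle:  {ipc:.2f}")
```

The hit rate is the "a little over half" mentioned in the post. Swapping in counters captured while CPDN tasks are running would show whether they behave differently from this workload.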
©2024 cpdn.org