Is HyperThreading BAD for Climate?

old_user98407

Joined: 15 Sep 05
Posts: 8
Credit: 205,423
RAC: 0
Message 16172 - Posted: 22 Sep 2005, 10:55:33 UTC

There seems to be a lot of cache misses and page faults when I run Climate BOINC on a two physical Xeon configuration with HyperThreading on. The smaller projects, like SETI and Protein, seem to do just fine with HyperThreading, but has anyone confirmed that Climate might do better with HyperThreading off?

ID: 16172
old_user2354

Joined: 28 Aug 04
Posts: 13
Credit: 767,708
RAC: 0
Message 16173 - Posted: 22 Sep 2005, 12:57:31 UTC - in response to Message 16172.  

There seems to be a lot of cache misses and page faults when I run Climate BOINC on a two physical Xeon configuration with HyperThreading on. The smaller projects, like SETI and Protein, seem to do just fine with HyperThreading, but has anyone confirmed that Climate might do better with HyperThreading off?


Well, I know that two climate WUs on my P4 HT system run quite badly together. I think it might be because the two programs are trying to use the same 'parts' of the CPU and thus create a bottleneck. With enough other projects, it's quite easy to run climate alongside some other work.

Just a non-professional opinion
ID: 16173
crandles
Volunteer moderator

Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 16174 - Posted: 22 Sep 2005, 13:05:59 UTC

Most estimates I have seen come up with a 15 to 20% throughput improvement with HT on. (Each model runs slower, but not twice as slow.)
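
A rough sketch of the arithmetic behind that estimate, assuming two models share one hyperthreaded core and that "throughput" means total work completed per unit time. The function name and the two-threads-per-core default are illustrative assumptions, not anything taken from BOINC itself.

```python
def per_model_slowdown(throughput_gain, threads_per_core=2):
    """Factor by which each model's run time grows when HT is on.

    throughput_gain: fractional gain in total work per unit time (0.15 = 15%).
    threads_per_core: models sharing one physical core (assumed 2 for HT).
    """
    # N models finish (1 + gain) times the work of one model in the same
    # wall-clock time, so each one takes N / (1 + gain) times as long.
    return threads_per_core / (1.0 + throughput_gain)


for gain in (0.15, 0.20):
    print(f"{gain:.0%} more throughput -> each model takes "
          f"{per_model_slowdown(gain):.2f}x as long")
```

Under these assumptions each model takes roughly 1.7 times as long, which matches "slower but not twice as slow".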
Visit BOINC WIKI for help

And join BOINC Synergy for all the news in one place.
ID: 16174
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 16177 - Posted: 22 Sep 2005, 13:56:02 UTC

I agree with crandles on this. But I don't have a Xeon, just a P4.

ID: 16177
old_user98407

Joined: 15 Sep 05
Posts: 8
Credit: 205,423
RAC: 0
Message 16385 - Posted: 2 Oct 2005, 14:25:55 UTC - in response to Message 16177.  
Last modified: 2 Oct 2005, 14:26:32 UTC

I agree with crandles on this. But I don't have a Xeon, just a P4.


Be aware that a computer configured with HyperThreading ON, but BOINC limited to the number of physical processors will waste half of its time looking for work for the logical processors and only contribute one-half of its capacity to BOINC-based projects.

For example, on a machine with two physical CPUs and HyperThreading ON, but BOINC limited to two (2) CPUs, you will have two BOINC-based processes each using 25% of CPU capacity, while Windows XP Pro will use 2% for the Task Manager and 48% running the SystemIdle loop, which actually consumes resources.
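
The percentages in that example follow from Task Manager reporting CPU use as a share of all logical CPUs. A minimal sketch of the arithmetic, where the function name `task_manager_share` is purely illustrative and each BOINC process is assumed to keep one logical CPU fully busy:

```python
def task_manager_share(busy_logical_cpus, total_logical_cpus):
    """Percent of total CPU reported for a number of fully busy logical CPUs."""
    return 100.0 * busy_logical_cpus / total_logical_cpus


# Two physical Xeons, two logical CPUs each with HT on.
logical_cpus = 2 * 2

per_process = task_manager_share(1, logical_cpus)  # one BOINC process
boinc_total = task_manager_share(2, logical_cpus)  # BOINC limited to 2 CPUs
remainder = 100.0 - boinc_total                    # Task Manager plus idle

print(per_process, boinc_total, remainder)
```

The remaining half is what shows up as the 2% for Task Manager plus the 48% attributed to the idle process.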

My advice is to configure a two-physical-CPU machine either with HyperThreading OFF (to see faster progress on each model) or with HyperThreading ON and BOINC's CPU limit matching the logical CPU count (to get slightly more total processing completed).

Of course, on projects other than ClimatePrediction, you will see 30-35% more throughput with HT ON, so I'm running that way across the board now.

The default configuration for BOINC seems to be for a single physical HT processor with HT on. That's why it is set to two.

ID: 16385
Pooh Bear 27

Joined: 5 Feb 05
Posts: 465
Credit: 1,914,189
RAC: 0
Message 16399 - Posted: 3 Oct 2005, 10:14:36 UTC

I have a P4 HT, running with HT on, and have had no issues with 2 CPDN models running simultaneously. I have 896 MB of memory (1 GB minus 128 MB shared for video). I've been running this way through several WUs and have never had one fail yet.

So in my experience, no, there is no issue with an HT machine running 2 CPDN WUs simultaneously.


ID: 16399
old_user98407

Joined: 15 Sep 05
Posts: 8
Credit: 205,423
RAC: 0
Message 16400 - Posted: 3 Oct 2005, 10:33:43 UTC - in response to Message 16399.  

... and never had one fail, yet.


In rereading the thread, I don't see failures mentioned, but when mixing the large and small models there appears to be some pretty inefficient processing (what with cache issues et al.). So much so that I would suspect that running without HT might be faster. But, in the long run, it probably isn't that big a deal. Thanks for your post.

ID: 16400
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1067
Credit: 16,546,621
RAC: 2,321
Message 16480 - Posted: 7 Oct 2005, 11:40:57 UTC - in response to Message 16400.  

... and never had one fail, yet.


In rereading the thread, I don't see failures mentioned, but when mixing the large and small models there appears to be some pretty inefficient processing (what with cache issues et al.). So much so that I would suspect that running without HT might be faster. But, in the long run, it probably isn't that big a deal. Thanks for your post.


Hyperthreading performance depends on lots of things. For example, on this machine I have two 3.06GHz hyperthreaded Xeon processors, the model with an L3 cache as well as the traditional L1 and L2 caches. The L3 cache is a megabyte, and the memory is the kind where you need a multiple of two memory modules because they run in parallel. So the memory is better able to keep the L3 cache filled, and the 512 KByte L2 cache gets its instructions and data from the L3, and the L1...

So if I were running four instances of ClimatePrediction, all the same application program, perhaps a fair amount of the working set of instructions and a bit of data would be in the cache and running pretty fast.

I never tried measuring this with ClimatePrediction, but I tried to get a qualitative handle on it for SetiAtHome. When I ran 1 process, it ran pretty fast, and when I ran 2, it was almost as fast, but when I ran 4, each slowed down about 20%. Note that the total throughput was monotonically increasing, so it was only the individual processes that were taking longer to complete. (I do not know what would happen with a 4-processor hyperthreaded machine: at some point it flattens out and, AFAIK, it might even get worse by adding more processes.)
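
To put rough numbers on that SetiAtHome observation, here is a small sketch. It assumes "slowed down about 20%" means each of the four processes ran at about 0.8 of the single-process speed; that reading, and the function name, are assumptions made for illustration.

```python
def relative_throughput(n_processes, per_process_speed):
    """Total work rate relative to one process running alone at speed 1.0."""
    return n_processes * per_process_speed


one_alone = relative_throughput(1, 1.0)
four_with_ht = relative_throughput(4, 0.8)  # assumed: each runs 20% slower

print(f"1 process:   {one_alone:.1f}x")
print(f"4 processes: {four_with_ht:.1f}x total, each at 0.8x speed")
```

On that reading, total throughput still rises well above the two-process case even though every individual work unit takes longer.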

If I also run a database application, the presence of BOINC processes really hurts. Offhand you would not expect this, since I run on Linux and the BOINC stuff runs only when the machine cannot use the processors for other tasks, and the process scheduler is working correctly. The trouble is that the DBMS starts an IO operation and gets suspended. The lower-priority BOINC application is dispatched and dirties up the cache; when the DBMS IO completes, it gets the processor back, but with a dirty cache, so it is dealing with the 533MHz memory instead of the 3.06GHz cache, slowing things down.

Answering the question of whether it is better to have hyperthreading on or off is very difficult because so many things enter into the calculations.
ID: 16480
old_user98407

Joined: 15 Sep 05
Posts: 8
Credit: 205,423
RAC: 0
Message 16483 - Posted: 7 Oct 2005, 23:19:40 UTC - in response to Message 16480.  


Answering the question of whether it is better to have hyperthreading on or off is very difficult because so many things enter into the calculations.

AGREED. Once I noticed that the HT machine would SystemIdle the logical processors and, therefore, not fully utilize the CPUs, I went back to four processes on four logical processors. The room temp is up a bit, but the overall throughput is better.

ID: 16483
Pooh Bear 27

Joined: 5 Feb 05
Posts: 465
Credit: 1,914,189
RAC: 0
Message 16484 - Posted: 8 Oct 2005, 14:08:43 UTC

I recently bought a P4 2.8 GHz non-HT, and I have a P4 2.8 GHz HT. The non-HT unit is outperforming the HT unit on CPDN by over double. Both are doing the same projects with the same percentages. The non-HT machine will do one WU in 23.5 days, so 2 in 47 days. The other takes 53 days to do 1 (because I do other projects, it is not running 2 simultaneously, but when it was running 2, it still took 53 days). There is a difference: the HT machine is a laptop, and it does run hotter than the desktop. All the other projects are close to 2-1. My thought is that the L2 cache is the reason. I have not turned off HT on the laptop to test, but after seeing this for over a week, I might shut it off and see if I get better performance and a little less heat.
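
Those figures can be read two ways, depending on whether the HT machine is running one CPDN model or two. A small sketch of both readings, using only the day counts quoted above (the function name is illustrative):

```python
def wu_per_day(work_units, days):
    """Work units completed per day of wall-clock time."""
    return work_units / days


non_ht = wu_per_day(1, 23.5)  # desktop: one WU in 23.5 days
ht_one = wu_per_day(1, 53)    # laptop running one CPDN model
ht_two = wu_per_day(2, 53)    # laptop running two CPDN models at once

print(f"non-HT vs HT running one model:  {non_ht / ht_one:.2f}x")
print(f"non-HT vs HT running two models: {non_ht / ht_two:.2f}x")
```

With one model on the laptop the desktop is over twice as fast, but with two models running the gap in total CPDN throughput is much smaller.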


ID: 16484
old_user5994

Joined: 31 Aug 04
Posts: 239
Credit: 2,933,299
RAC: 0
Message 16485 - Posted: 8 Oct 2005, 14:41:46 UTC

Trying to make comparisons of HT vs. non-HT is hard. For one thing, OTHER than the HT capability, the CPU, motherboard, memory, etc. ALL have to be the same for a valid comparison.

Laptops, whether or not the CPU has the same clock speed, are nowhere near comparable to a standard desktop in their other components. Laptop components WILL be slower.

Running with HT on will give you a 20-60% (nominally 40%) improvement in throughput: slower speed per thread, but greater total performance. Exactly what you will get depends on exactly what you are doing at the time. HT mode works by running thread #2 when thread #1 hits a roadblock or when it is not using some of the internal components.

Like multi-tasking on a computer, HT mode is a way to get the most out of the total system ...
ID: 16485
old_user98407

Joined: 15 Sep 05
Posts: 8
Credit: 205,423
RAC: 0
Message 16487 - Posted: 8 Oct 2005, 15:28:39 UTC - in response to Message 16485.  


Running HT on, will give you a 20-60%, nominal 40% improvement in throughput. Slower speed, but greater total performance.

I agree that HT will typically provide better throughput, but not even Intel suggests the high end of that range. I think 15-30% is more likely. The most obvious improvement is on a system where HT is ON but BOINC is limited to the number of physical CPUs (which is the default for a 2-CPU system). Raising the limit to the logical CPU count about doubles your throughput, because otherwise the SystemIdle process is doing real work of no benefit.

I have identical dual-CPU, HT-capable Xeon IA-32 Dell 650s. I've run tests with SETI, Protein and Climate (Folding failed on a floating-point problem). Both systems are identical down to XP patch level and software installed. Only Climate doesn't seem to benefit much from HT, and it isn't worth bouncing the machines to turn HT off just for Climate.


ID: 16487
old_user98407

Joined: 15 Sep 05
Posts: 8
Credit: 205,423
RAC: 0
Message 16488 - Posted: 8 Oct 2005, 15:35:08 UTC - in response to Message 16484.  

The non-HT unit is out performing CPDN by over double. Both are doing the same projects with the same percentages

Press Ctrl+Alt+Delete to bring up your system monitor (Task Manager) and see what's using CPU on the HT machine. My bet is the BOINC projects might not be using the full CPU available, for some reason. On my HT boxes, the work units clock in at twice as long, but two are produced in that period. The clock isn't CPU usage, but the system clock. So, if two work units start at 00:00 and end at 00:59, they both show 00:59, but each effectively took half that.

On my machines, setting to non-HT generates a LOT less heat. On a laptop, that probably means something to battery life.



ID: 16488
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2170
Credit: 64,555,907
RAC: 5,858
Message 16499 - Posted: 9 Oct 2005, 4:04:33 UTC - in response to Message 16484.  

There is a difference, the HT is a laptop, and does run hotter than the desktop. All the other projects are close to 2-1. My thought is the L2 Cache is the reason. I have not turned off HT on the laptop to test, but after seeing this for over a week, I might shut it off and see if I get a better performance hit, and a little less heat.

My P4 laptop with hyperthreading throttles down when it gets hot. Instead of running at 3.06 GHz, it slows down to 2.1 GHz. Many laptops will throttle back either by lowering the CPU clock speed or by inserting idle commands. You can test it with throttlewatch if you want to see if yours is doing it.
ID: 16499
old_user15351

Joined: 8 Sep 04
Posts: 23
Credit: 121,446
RAC: 0
Message 17128 - Posted: 11 Nov 2005, 1:47:03 UTC

I have the exact same system as TheSleuth, i have hyperthreading on, but limit BOINC to 2 CPUs (the idea being that BOINC gets the real one (ideally) and i'm using the hyperthreading bit unless i'm doing some real work on it)

I recently tried running 2 CPDN models as well as other projects,
when i only run one model, and another project, i get about 3 s/TS.

With 2 models running together, i got about 4.0-4.5 s/TS so quite an improvement personally, so imo, it's worth having HT on :)
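
A quick sketch of why a higher s/TS figure can still mean an improvement. It assumes each of the two models ran at the quoted 4.0-4.5 s/TS and counts only CPDN timesteps, ignoring whatever the other project was doing in the one-model case; the function name is illustrative.

```python
def timesteps_per_second(models, seconds_per_timestep):
    """Combined CPDN timesteps per second across all running models."""
    return models / seconds_per_timestep


baseline = timesteps_per_second(1, 3.0)  # one model at about 3 s/TS

for s_per_ts in (4.0, 4.5):
    combined = timesteps_per_second(2, s_per_ts)
    gain = combined / baseline - 1
    print(f"2 models at {s_per_ts} s/TS each: {gain:+.0%} CPDN throughput")
```

Each model is slower on its own, but the pair together gets through roughly a third to a half more timesteps in the same time.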
ID: 17128
old_user98407

Joined: 15 Sep 05
Posts: 8
Credit: 205,423
RAC: 0
Message 17136 - Posted: 11 Nov 2005, 12:41:09 UTC - in response to Message 17128.  


With 2 models running together, i got about 4.0-4.5 s/TS so quite an improvement personally, so imo, it's worth having HT on :)


For the most recent 30-day period, I have been running non-HT with a 2-CPU limit on one machine, and yes-HT and 4-CPU limit on the other. The two-CPU limit non-HT machine has produced slightly higher credits (+/-5%) over that period and it is suspended for real work much more often. This might be because the sulf tests get more credit?

When you run yes-HT but two-CPU, you are letting SystemIdle take half of your capacity, at least in my tests.
ID: 17136
old_user15351

Joined: 8 Sep 04
Posts: 23
Credit: 121,446
RAC: 0
Message 17138 - Posted: 11 Nov 2005, 14:44:07 UTC - in response to Message 17136.  

For the most recent 30-day period, I have been running non-HT with a 2-CPU limit on one machine, and yes-HT and 4-CPU limit on the other. The two-CPU limit non-HT machine has produced slightly higher credits (+/-5%) over that period and it is suspended for real work much more often. This might be because the sulf tests get more credit?

When you run yes-HT but two-CPU, you are letting SystemIdle take half of your capacity, at least in my tests.


ah, well yes, and i suppose each could be running on a separate physical CPU as well (leaving the 2 virtual ones free)

the reason i have 4 CPU with a limit of 2 is because i find BOINC affects system performance when doing intensive tasks, especially games, so i limit it to 2 mostly, sometimes 3 if i know i won't be doing anything stressful for a while (1 spare for the system to have at its disposal)
ID: 17138
old_user94880

Joined: 27 Aug 05
Posts: 156
Credit: 112,423
RAC: 0
Message 17463 - Posted: 26 Nov 2005, 20:12:58 UTC
Last modified: 26 Nov 2005, 20:15:11 UTC

Running a 840ee 3.2 with HT, 3 gig ram, running 6 projects; Einstein, Seti, Rosetta, LHC, CPDN, and Predictor.
Presently have 10 Ts that range from 2.1544 to 2.1648 completion time was estimated at 470 hrs but has gone up to about 530 hrs.

Just started 2 Sulphur runs on my other 840ee 3.2 with HT, 3 gig ram, estimated time 1470 hrs each...also is running Einstein and Seti.

Both running great so far.....
BOINC Wiki
ID: 17463


©2024 climateprediction.net