HADAM3P 6.06 on One Core of 4-Core Computers?

rbpeake Joined: 27 Feb 08 Posts: 41 Credit: 1,402,356 RAC: 0
Message 36667 - Posted: 9 Apr 2009, 20:16:45 UTC
I see from the News section that HADAM3P 6.06 is recommended to run on only one core of a multi-core machine. I assume that applies to both 2-core and 4-core machines? I guess if one is really "hungry" to work on HADAM3P 6.06, one could run it on two or more cores and live with the slowdown, figuring that at the end of the day more HADAM3Ps will be processed per unit of time than if they were run serially.
Regards, Bob P.

Les Bayliss Volunteer moderator Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0
Message 36668 - Posted: 9 Apr 2009, 21:15:52 UTC
Yes, it's just to do with the slowdown due to resource contention. More of a warning, really, so that people know to expect slowdowns the more they load their computer with this model. But it applies to all model types to some extent. I've run 4 at a time on beta, and currently have 3 production models running on one computer.

rbpeake Joined: 27 Feb 08 Posts: 41 Credit: 1,402,356 RAC: 0
Message 36669 - Posted: 9 Apr 2009, 22:12:07 UTC - in response to Message 36668. Last modified: 9 Apr 2009, 22:14:19 UTC
[quote]Yes, it's just to do with the slowdown due to resource contention. More of a warning, really, so that people know to expect slowdowns the more they load their computer with this model. But it applies to all model types to some extent. I've run 4 at a time on beta, and currently have 3 production models running on one computer.[/quote]
Thanks! Actually, I have found that my single-core AMD 3500+ runs this model much more efficiently, relatively speaking, than my Q9550 does when running this model on one core and using the other 3 cores for other models. It is still slower than the quad, of course, but not by as much as on other Boinc projects. Good news for my single core, which is usually quite the slow one compared to my quad! ;)
Regards, Bob P.

mo.v Volunteer moderator Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0
Message 36675 - Posted: 10 Apr 2009, 19:52:01 UTC
In the News post about this model I didn't intend to specify only running one of these at a time on a quad, though even two running simultaneously usually seems to cause some slowdown. Members will have to look at what their own computer can handle. Yes, other model types can also slow each other down, but the slowdown seems to be greater in the case of HadAM3Ps.
If any members can compare and post the speed of HadAM3Ps when running 1, 2, 3 or 4 of them simultaneously, this would be useful. Speed comparisons on dual-cores would also be useful.

DaveG27 Joined: 8 Nov 06 Posts: 18 Credit: 2,425,895 RAC: 0
Message 36677 - Posted: 10 Apr 2009, 20:19:50 UTC Last modified: 10 Apr 2009, 20:20:03 UTC
When I had 3 HadAM3Ps running simultaneously on a quad, with a HadCM3 on the other core, they trickled at 9.5 s/ts. When I changed it to 1 HadAM3P, 1 HadCM3 and 2 other projects, it sped up to 4.7 s/ts. I also worked out that the HadCM3 was running 30-40% slower with 3 HadAM3Ps.
Dave
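
A rough way to read Dave's two figures, sketched in Python. It assumes all three HadAM3Ps really ran at the same 9.5 s/ts, and it ignores the 30-40% slowdown of the HadCM3 and whatever the other projects would have done with the freed cores, so it only compares HadAM3P timesteps. The function name is made up for illustration.

```python
def aggregate_throughput(n_models: int, sec_per_timestep: float) -> float:
    """Timesteps completed per second across n_models identical models."""
    return n_models / sec_per_timestep


# Figures from the post: 3 models at 9.5 s/ts versus 1 model at 4.7 s/ts.
three_models = aggregate_throughput(3, 9.5)
one_model = aggregate_throughput(1, 4.7)

print(f"3 x HadAM3P: {three_models:.3f} timesteps/s in total")
print(f"1 x HadAM3P: {one_model:.3f} timesteps/s in total")
print(f"ratio: {three_models / one_model:.2f}")
```

On these numbers the three slower models together still get through roughly one and a half times as many HadAM3P timesteps as the single fast one, which is the trade-off raised in the first post; the cost is borne by each individual model and by the HadCM3 running alongside.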

old_user81594 Joined: 11 Jun 05 Posts: 67 Credit: 1,222,916 RAC: 0
Message 36699 - Posted: 12 Apr 2009, 10:12:21 UTC - in response to Message 36677.
[quote]When I had 3 HadAM3Ps running simultaneously on a quad, with a HadCM3 on the other core, they trickled at 9.5 s/ts. When I changed it to 1 HadAM3P, 1 HadCM3 and 2 other projects, it sped up to 4.7 s/ts. I also worked out that the HadCM3 was running 30-40% slower with 3 HadAM3Ps. Dave[/quote]
I did read about the slowdown the other week, and also about the crazy RAC situation with the HADAM3P models. I seem to be getting acceptable credits per day, if a little low for a 950D and a Q6600 machine. Combined over the last few days they average out at about 1870, but my RAC is 3909!
Back to the s/TS. On my Q6600 I have 3 x HADAM3P and a Mid-Holocene running: the three AM3Ps are at 8.1, 6.8 and 7.0 s/TS and the Mid-Holocene is at 1.8 s/TS. On my 950D (3.4GHz dual-core, 800MHz FSB) I have an AM3P running at 6.8 s/TS and a Mid-Holocene running at 1.65 s/TS.
I'll experiment now for 48 hours or so to see if I can run just one AM3P on my Q6600 to get a speed-up. Will report back in a couple of days.
Neil.

old_user81594 Joined: 11 Jun 05 Posts: 67 Credit: 1,222,916 RAC: 0
Message 36700 - Posted: 12 Apr 2009, 10:24:40 UTC - in response to Message 36699.
[quote]I'll experiment now for 48 hours or so to see if I can run just one AM3P on my Q6600 to get a speed-up.[/quote]
Sunday 12th April: On my Q6600, current projects running are now 2 x World Community Grid and 2 x CPDN. The two CPDN models are one HADAM3P (currently at 13.9% and 8.12 s/TS) and an SM3 Mid-Holocene 6.02 (currently at 53.1% and 1.97 s/TS). (The figures in my previous post were actually taken about 16 hours ago, so you can already see a slowdown.) Let's see if that helps.
On my 950D, I'm going to pause the Mid-Holocene so it is just running one model on a dual-core machine (it has DDR2 PC6400 RAM, so I hope that helps), but as the L2 cache is dedicated to each core, maybe the improvements won't be so dramatic.
Regards, Neil.

old_user81594 Joined: 11 Jun 05 Posts: 67 Credit: 1,222,916 RAC: 0
Message 36704 - Posted: 12 Apr 2009, 21:01:42 UTC - in response to Message 36700.
[quote]Sunday 12th April: On my Q6600, current projects running are now 2 x World Community Grid and 2 x CPDN. The two CPDN models are one HADAM3P (currently at 13.9% and 8.12 s/TS) and an SM3 Mid-Holocene 6.02 (currently at 53.1% and 1.97 s/TS).[/quote]
Well, after just 11 hours or so, my HADAM3P is now down to 6.63 s/TS from 8.1, with all four cores running Boinc (2 x CPDN and 2 x WCG) ...but no CPDN stats uploaded today, I see?
Neil.

Les Bayliss Volunteer moderator Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0
Message 36705 - Posted: 12 Apr 2009, 21:12:08 UTC
[quote]...but no CPDN stats uploaded today, I see?[/quote]
Server problems. See this news post.

Virtual Boss* Joined: 14 May 08 Posts: 29 Credit: 776,852 RAC: 0
Message 36795 - Posted: 24 Apr 2009, 9:56:53 UTC
Holy cow, batman - it's flying.....
I recently downloaded my first HADAM3P on a single-core VMWare virtual machine with affinity to one core of a Q6600 quad (@ 2.52GHz). Estimated runtime was ~170 Hrs. It is now at 27.905% in 19:45:00, which indicates completion at under 71 Hrs. First six trickles have s/ts of 3.5 +/- 0.03.
Awesome!
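
The "under 71 Hrs" figure can be reproduced with a simple linear extrapolation, sketched here in Python. It assumes progress is proportional to elapsed time for the whole run and that the elapsed time is given as h:mm:ss; the function name is illustrative only and is not part of BOINC.

```python
def estimated_total_hours(percent_done: float, elapsed_hms: str) -> float:
    """Extrapolate total run time (hours) from progress so far.

    Assumes the model advances at a constant rate, which is only an
    approximation for a real run.
    """
    hours, minutes, seconds = (int(part) for part in elapsed_hms.split(":"))
    elapsed_hours = hours + minutes / 60 + seconds / 3600
    return elapsed_hours * 100.0 / percent_done


# Figures from the post: 27.905% complete after 19:45:00.
total = estimated_total_hours(27.905, "19:45:00")
print(f"Estimated total run time: {total:.1f} hours")
```

With the numbers quoted in the post this gives about 70.8 hours, well below the client's initial ~170 hour estimate.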

astroWX Volunteer moderator Joined: 5 Aug 04 Posts: 1496 Credit: 95,522,203 RAC: 0
Message 36799 - Posted: 24 Apr 2009, 18:25:15 UTC Last modified: 24 Apr 2009, 18:38:30 UTC
Seems about right. I have one ready to report, showing 61:50 on a Q9300 running stock speed (2.5 GHz), no affinity set. (It ran with three HadCM3 and also had a large initial estimated run time.)
Edit: General information. See DaveG27's post in this Thread. This Model type doesn't play well with others of its type, thanks to memory contention. The issue is discussed with more detail elsewhere. The point is, you'll lose much of what you see if you run more than one at a time. (At one point in Beta, I had four going on a Q9550. I don't recall particulars but it was grim.)
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.

Iain Inglis Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317
Message 36800 - Posted: 24 Apr 2009, 19:37:11 UTC
Here are some results for HADAM3P on two Intel quads. The columns 1 to 4 are the number of models running simultaneously.

Speed
CPU/ts    1     2     3     4
------  ----  ----  ----  ----
Q9550   2.56  2.90  3.47  4.30
Q6600   3.25  3.73  4.49  5.77

Work (Absolute)
cr/hr     1     2     3     4
-----   ----  ----  ----  ----
Q9550   38.6  68.2  85.5  92.1
Q6600   30.4  53.1  66.1  68.6

Work (Relative)
cr/hr     1     2     3     4
-----    ---   ---   ---   ---
Q9550    1.0   1.8   2.2   2.4
Q6600    1.0   1.7   2.2   2.3

The last table shows that the HADAM3P models are not very scalable: a perfect quad would produce four times as many credits per hour when running four models as against one model, whereas the real figure seems more like 2.3 - 2.4. However, a bit more work is done as each model is added.
As a comparison, slabs on the Q9550 give a ratio of 3.3 for four models running as against one model; the original HADAM3 gives only 2.6.
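
The relative-work figures follow directly from the speed figures, as this Python sketch shows. It assumes that work done is proportional to timesteps completed, so n models each taking t seconds per timestep deliver n/t timesteps per second; the names used are illustrative and not taken from any CPDN tool.

```python
# Seconds per timestep with 1, 2, 3 and 4 simultaneous HADAM3P models,
# copied from the speed figures in the post above.
SEC_PER_TS = {
    "Q9550": [2.56, 2.90, 3.47, 4.30],
    "Q6600": [3.25, 3.73, 4.49, 5.77],
}


def relative_work(sec_per_ts):
    """Total throughput with n models, relative to a single model."""
    base = sec_per_ts[0]
    return [n * base / t for n, t in enumerate(sec_per_ts, start=1)]


def parallel_efficiency(sec_per_ts):
    """Fraction of ideal n-fold scaling actually achieved with n models."""
    return [w / n for n, w in enumerate(relative_work(sec_per_ts), start=1)]


for cpu, times in SEC_PER_TS.items():
    work = ", ".join(f"{w:.1f}" for w in relative_work(times))
    eff = ", ".join(f"{e:.0%}" for e in parallel_efficiency(times))
    print(f"{cpu}: relative work {work}; efficiency {eff}")
```

Rounded to one decimal place, the computed ratios agree with the relative-work figures in the post, and the efficiency values make the point about scalability explicit: with four models running, each quad achieves a little under 60% of ideal scaling.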

JIM Joined: 31 Dec 07 Posts: 1152 Credit: 22,097,287 RAC: 2,957
Message 36805 - Posted: 25 Apr 2009, 6:02:25 UTC - in response to Message 36795.
[quote]Holy cow, batman - it's flying..... I recently downloaded my first HADAM3P on a single-core VMWare virtual machine with affinity to one core of a Q6600 quad (@ 2.52GHz). Estimated runtime was ~170 Hrs. It is now at 27.905% in 19:45:00, which indicates completion at under 71 Hrs. First six trickles have s/ts of 3.5 +/- 0.03. Awesome![/quote]
Hi,
It certainly is a pleasure watching the AM3P's zip along at approx. 12% per day(!) on my 2 snails. Finishing one in about 100 hrs. is delightful. I am used to seeing 80 yr. CM models doggedly crawl along at 0.8% per 24 hrs. Even the Mid-Holocene models take about 4 weeks to finish.

DJStarfox Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370
Message 36819 - Posted: 27 Apr 2009, 3:51:52 UTC - in response to Message 36805.
[quote]It certainly is a pleasure watching the AM3P's zip along at approx. 12% per day(!) on my 2 snails. Finishing one in about 100 hrs. is delightful. I am used to seeing 80 yr. CM models doggedly crawl along at 0.8% per 24 hrs. Even the Mid-Holocene models take about 4 weeks to finish.[/quote]
Yeah, if I'm running just one model with other projects, I can get a HadAM3P model done in 48 hours of crunch time. That's a far cry from my old system, which took 6 months to run its first HadCM3.

mo.v Volunteer moderator Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0
Message 36822 - Posted: 27 Apr 2009, 13:00:46 UTC Last modified: 27 Apr 2009, 13:03:45 UTC
However, it's not realistic to compare those two model types in terms of their efficiency. The HadCMs do a complete 160-year run on one computer whereas the HadAM3Ps each do a two-year part of a longer run, so lots of them need to be stitched together to obtain the final data for the researchers. That's why they last 25 months, not 24. The models overlap. One model's completion data provides the starting data for the following two-year run of that series. But I do agree that it's very satisfying to rack up a lot of completions in just a few weeks.
I'm finding that two HadAM3Ps processed in tandem on my Core2Duo slow down by about 8% compared with a HadAM3P run together with a HadAM. The big slowdown seems to be in the case of quads that run 3 or 4 HadAM3Ps together. I'm guessing that whatever the specific resource is that these models compete for, quads mustn't usually have double the amount of it that duos have. But I haven't yet tried running a HadAM3P completely solo on the Core2Duo so I could be mistaken.

Pete B Joined: 26 Aug 04 Posts: 67 Credit: 9,356,824 RAC: 4,847
Message 36823 - Posted: 27 Apr 2009, 22:17:37 UTC - in response to Message 36819.
[quote]Yeah, if I'm running just one model with other projects, I can get a HadAM3P model done in 48 hours of crunch time. That's a far cry from my old system, which took 6 months to run its first HadCM3.[/quote]
Well, that's better than my original system for this project, which took 4 months to run its first HadSM3, albeit a 4-phase one, so 1 month per phase.
My current top system, a Phenom 9950 @ stock 2.6GHz, is at present running 1 HadCM3 160 yr run (2.03 s/ts) and 2 HadAM3Ps (3.9 s/ts), and is just about to complete a HadSM3 (1.7 s/ts) in a Linux VirtualBox, after which it will begin a HadAM3P in the virtual machine too. In the current setup, it takes about 75 hrs to do the HadAM3Ps, but I haven't tried one HadAM3P on its own. No doubt with the AM3P running in the virtual machine, they will all slow down somewhat though.

DJStarfox Joined: 27 Jan 07 Posts: 300 Credit: 3,288,263 RAC: 26,370
Message 36824 - Posted: 28 Apr 2009, 2:57:57 UTC - in response to Message 36823.
[quote]My current top system, a Phenom 9950 @ stock 2.6GHz, is at present running 1 HadCM3 160 yr run (2.03 s/ts) and 2 HadAM3Ps (3.9 s/ts), and is just about to complete a HadSM3 (1.7 s/ts) in a Linux VirtualBox, after which it will begin a HadAM3P in the virtual machine too. In the current setup, it takes about 75 hrs to do the HadAM3Ps, but I haven't tried one HadAM3P on its own. No doubt with the AM3P running in the virtual machine, they will all slow down somewhat though.[/quote]
For an Opteron CPU, I'd say no problem with slowdown. But I'm not sure about the Phenom chips. I know the Phenom II chips make a significant improvement, especially for virtualized guests. This is because those CPUs can do nested page tables (CPU feature NPT). The Phenom II chips are under $200 with shipping if your MB has an AM2+ socket.
Those performance numbers look about right for what you have.

Pete B Joined: 26 Aug 04 Posts: 67 Credit: 9,356,824 RAC: 4,847
Message 36825 - Posted: 28 Apr 2009, 8:09:50 UTC - in response to Message 36824. Last modified: 28 Apr 2009, 8:12:21 UTC
[quote]For an Opteron CPU, I'd say no problem with slowdown. But I'm not sure about the Phenom chips. I know the Phenom II chips make a significant improvement, especially for virtualized guests. This is because those CPUs can do nested page tables (CPU feature NPT). The Phenom II chips are under $200 with shipping if your MB has an AM2+ socket. Those performance numbers look about right for what you have.[/quote]
Yes, it's a fully AM2+ compatible MoBo, allowing the use of AM3 CPUs in AM2+ mode. There is plenty of potential O/C headroom available that I haven't tried yet with the combination of this MoBo with the ACC facility, the Phenom BE (unlocked multiplier) and 8GB of DDR2 1066 certified RAM that is only running at 800, although with 4 sticks that's probably all one would get that was stable.
I did try enabling nested paging in the virtual machine but it didn't make any difference, at least in these applications. Sun didn't make it clear in their current VirtualBox documentation that this facility is only enabled in the Phenom II. Something I wasn't aware of either until your post here!
The newly started Linux AM3P is @ 4.5 s/ts after 2 trickles with, as expected, the other 2 AM3Ps and the HadCM3 in Windows all slowing slightly now.