climateprediction.net home page
Posts by old_user271

1) Questions and Answers : Windows : Horrid bug (I think) - lost run in phase 3 with exit code of -1 (Message 1676)
Posted 25 Aug 2004 by old_user271
Post:
> They must have trimmed the EventLog writing, if you bring up TaskManager do
> you see a "hadsm3_4.03*" and "hadsm3um_4.03*" and the "boinc_cli.exe"
> running?


Everything is present and correct, Carl - the model is running.


Any comments on the other observations?



David
2) Questions and Answers : Windows : Horrid bug (I think) - lost run in phase 3 with exit code of -1 (Message 1672)
Posted 25 Aug 2004 by old_user271
Post:
An inadvertent and catastrophic piece of mistyping when upgrading from client 4.02 to 4.05 was:

boinc_gui -uninstall

(at least I think it was -uninstall - it was -something).

Of course, I meant boinc_cli -uninstall, to uninstall the service.


That resulted in the loss of my only decent run with an exit code of -1, and sadly no backup (I don't have the room on my tapes to back up BOINC, though I may try, especially after this, to at least make a nightly disk-based backup of the BOINC folder).
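
A minimal sketch of the sort of nightly disk-based backup mentioned above, in Python. The function name, the folder paths and the archive naming are all hypothetical - nothing here comes from BOINC itself. It simply zips a folder into a dated archive; it is probably safest to stop the client first so the files aren't changing mid-copy, and the backup destination should sit outside the BOINC folder so the archive doesn't end up inside itself.

```python
import shutil
import time
from pathlib import Path


def backup_folder(source_dir, backup_root, stamp=None):
    """Zip source_dir into backup_root and return the path of the archive."""
    source = Path(source_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"no such folder: {source}")
    dest_root = Path(backup_root)
    dest_root.mkdir(parents=True, exist_ok=True)
    # Default to one archive per calendar day (hypothetical naming scheme).
    stamp = stamp or time.strftime("%Y%m%d")
    base_name = dest_root / f"boinc-backup-{stamp}"
    # make_archive appends ".zip" and returns the full archive path as a string.
    return Path(shutil.make_archive(str(base_name), "zip", root_dir=str(source)))


if __name__ == "__main__":
    # Both paths are assumptions for illustration; adjust to the real install.
    print(backup_folder(r"C:\BOINC", r"D:\Backups\BOINC"))
```

Run from a scheduled task each night, this would leave one zip per day in the destination folder; pruning old archives is left out of the sketch.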


I figured I may as well reset the project to pick up a new 4.03 based run. I had a couple of runs lying around with a few execution hours from multi-processor testing (this is a dual processor machine with Hyperthreading), but nothing with enough execution to be worth keeping, and I gathered, possibly incorrectly, that with the bugs in the 4.02 client it was probably better to reset such runs.


I then made a similar typing mistake (not my night - but maybe it's produced a useful result) installing the 4.05 client as a service - typing

boinc_gui -install

instead of:

boinc_cli -install


and promptly lost my new 4.03 run under the 4.05 client also with an exit code of -1. If this is a bug (and it sure looks like one to me), it's in the release code.


I've now got the new run set up as a service - and I'm not seeing anything appearing in the application event log - unlike with the 4.02 client. I had picked up that people wanted BOINC to be less verbose in the event log - but I'd expected something!


Is it wise to clear up all the garbage in my BOINC folder manually - 4.02 project files, also all the folders relating to my failed runs?



David
3) Message boards : Number crunching : BOINC versus legacy CPDN (Message 802)
Posted 12 Aug 2004 by old_user271
Post:
> In the alpha trial I found running 2 models with HT enabled gave 13% improved
> performance over a single model with HT disabled.

That sort of gain would be in line with my expectations for HT.


> I presume your other tasks are running at a higher priority than the CPDN
> ones, so every time they require >12.5% of the available processing power
> it's possible that a CPDN job could be bumped to another processor (this
> assumes the worst case scenario of the other tasks all being assigned to the
> same processor). If you're lucky the CPDN job will be bumped to the other
> virtual processor on the same physical one. Sods law dictates this won't be
> the case :(

I'm curious as to the derivation of the 12.5% figure - is this based on the architecture of the Windows task scheduler?


My experience is that, on a machine that's really doing nothing more than idling (though there are various idle background tasks that will wake up periodically), Windows apparently rolls climateprediction.net (I'm talking about the legacy client here, not BOINC) between the processors on my system. This happens even if you turn Hyperthreading off, so you just have two physical processors.


If I'm correct about this behaviour it is, of course, nonsensical, as it wipes out all the caching gains (my Xeons have no L3 cache, but they do have a 512K L2 cache). I don't see the point of putting any L3 cache processors in this box; they're extremely expensive even now. 3.06GHz 533MHz FSB 1MB L3 cache Xeons are currently around 480 pounds each! The same processor without the L3 cache is about 200 pounds - it's not worth spending 400 pounds for a 15% processor performance gain. I will keep an eye on prices - if I find a couple of faster processors being remaindered when the 533MHz stuff is becoming obsolete, it may be worth spending a bit of money then - though I suspect my workstation will be very long in the tooth at that point.

Rather than buying new processors for this box, it would arguably be better to buy a new E7525 based Dell Precision 670 with 800MHz FSB Xeons in a largely minimal configuration (apart from, of course, memory - and a graphics card as the E7525 board is PCI Express, not AGP), and move my SCSI hard disks and other high-end peripherals to the new box.


In fact, I'm not going to do any hardware changes. I'm quite happy with the performance of this E7505 based Dell Precision 650 workstation, and will be sitting still for some time. The computers I've got are likely to serve my needs for several years to come, and by the time I'm in the market for a new computer, I expect various significant changes will have happened (64-bit may well be mainstream, for example).


> One thing that will *really* slow things down is running one or more
> visualisation. That can eat up lots of processing power.

I'm not using any visualisations - I'm running BOINC as a service (that way, it doesn't matter whether a user is logged on or not). BOINC climateprediction.net as a Windows service doesn't currently offer visualisations.


> > I do wonder whether support for explicit processor affinity in BOINC (and, for
> > that matter, in legacy CPDN) would help - if it's possible to add such support
> > to a program. If you don't set processor affinity, then, at least in Windows
> > XP, my impression is that the process can 'spin' between processors. On a
> > single processor Hyperthreading machine, there's no performance hit spinning
> > between the two virtual processors; they share one cache. On a machine with
> > more than one physical processor, there is a performance hit, as each
> > processor can only access its own cache.
>
> Being able to set processor affinity would definitely help, but you'd have to
> live with at least one of the CPDN jobs having reduced performance because of
> your background tasks. The SETI@Home command line client has an option to run
> on a fixed CPU, so it's certainly possible.

I wonder if the BOINC folk could investigate this - I believe, but am not 100% sure (I don't have the time to check MSDN) that Windows processes can set their own processor affinity on startup.
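
For what it's worth, the Win32 API does have a call for this, `SetProcessAffinityMask`, which takes a bitmask with one bit per logical processor. Below is a rough Python sketch of how a process might pin itself at startup. The helper names are my own, and which logical processor numbers share a physical package on a Hyperthreading machine is an assumption you would need to check on the box itself. I'm not suggesting this is how BOINC or legacy CPDN would do it.

```python
import ctypes
import sys


def affinity_mask(logical_cpus):
    """Build an affinity bitmask: bit n set means logical processor n is allowed."""
    mask = 0
    for cpu in logical_cpus:
        if cpu < 0:
            raise ValueError("logical processor numbers start at 0")
        mask |= 1 << cpu
    return mask


def pin_current_process(logical_cpus):
    """Restrict this process to the given logical processors (Windows only).

    Returns True if the affinity was set, False on failure or on other platforms.
    """
    if sys.platform != "win32":
        return False
    mask = affinity_mask(logical_cpus)
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentProcess.restype = ctypes.c_void_p
    kernel32.SetProcessAffinityMask.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    kernel32.SetProcessAffinityMask.restype = ctypes.c_int
    handle = kernel32.GetCurrentProcess()
    return bool(kernel32.SetProcessAffinityMask(handle, mask))


if __name__ == "__main__":
    # Assumption for illustration: logical processors 0 and 1 are the pair to use.
    print(pin_current_process([0, 1]))
```

The mask for logical processors 0 and 1 is binary 11; a model pinned this way should stop wandering to the other processors, at the cost of competing with whatever else runs there.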

Of course, running BOINC as a service, I could sort this myself manually - if I run the BOINC service as something I can log in as, then when I reboot I can log in and set the processor affinity of my BOINC climateprediction.net processes. I may experiment with this when I've finished my legacy CPDN run and am running four BOINC models.


> > Meanwhile, I'd appreciate advice on what I should do - I suppose the optimal
> > way ahead is to mark my legacy CPDN account to finish at the end of this run,
> > ratchet up to 3 BOINC CPDNs, accept that my place in the legacy CPDN machine
> > league table will fall consistently from this point (sob! - I've already
> > dropped one place), but that my place in the BOINC machine league table will
> > be much better.
>
> My position in CPDN classic started plummeting when I started alpha testing.
> I was in the top 150 but last time I checked I'd dropped about 100 places :(
>
> But I guess that means there's lots of people active in the project ;-)

There's no way I can run BOINC and keep my machine leader board place - I think the decision is made to run my legacy CPDN run to the end, and stop there, then switch to four BOINC runs. It may not get me that spectacular a leader board place on either system, but it's about the science.


> > Then, when my legacy CPDN run finishes, I should switch to 4 BOINC CPDNs.
>
> I'd one box finish a THC experiment the day after the first batch of BOINC
> invites went out. I'd no hesitation in switching it over so I could run 2
> faster experiments.

If I want to finish legacy CPDN completely, presumably I remove the tick next to "account active" in my account page - that is the right way to stop my machine downloading a new run when it finishes this one, isn't it?


> > I don't suppose there's any plan to link the legacy CPDN stats with the BOINC
> > ones, is there?
>
> No, but the legacy experiments will continue to run for a long time yet (the
> planned OU course is built around that rather than BOINC).

I understand - basically the two will continue in parallel, though BOINC will be the way ahead, in terms of new experiments and optimised use of current and future hardware.



David
4) Message boards : Number crunching : BOINC versus legacy CPDN (Message 520)
Posted 9 Aug 2004 by old_user271
Post:
Clarifying what I wrote:

> If you don't set processor affinity, then, at least in Windows
> XP, my impression is that the process can 'spin' between processors.

My impression is that processes not only can 'spin' between processors, but that they usually *do* spin.



David
5) Message boards : Number crunching : BOINC versus legacy CPDN (Message 518)
Posted 9 Aug 2004 by old_user271
Post:
> I did some HT testing during the alpha trial. Running the BOINC client didn't
> seem to affect the performance of my legacy CPDN.

Running BOINC affects legacy CPDN here (just as a recap, the machine is dual Xeon 2.66GHz, Hyperthreading on, Windows XP Professional SP1). The following discussion is inevitably Windows-centric.

You can take a look at my legacy CPDN trickles at http://cpdn.comlab.ox.ac.uk/user/trklm.php?mid=12393


In the first period where the seconds per trickle jumps, I was running as many as 3 BOINC CPDNs. I dropped back to 1 BOINC CPDN subsequently and the time period per trickle dropped, but there's still around a 40% speed penalty over running without BOINC.

I'm somewhat unclear why this is the case, what I can do about it, and what I should do about it.


The machine does have some real work to be getting on with as well as crunching climateprediction.net models. Being the biggest machine by far on my network, it is my backup server, it runs all my network and UPS monitoring and it's my mail server. It doesn't run a server version of Windows because I don't really need a Windows server on my relatively small network, so I didn't want to pay for the server OS; also, Dell doesn't support server versions of Windows on this hardware. Not that I think I would have a problem with drivers if I did install Windows Server 2003 - it's a relatively standard Intel E7505 chipset motherboard, and both the motherboard SCSI and Gigabit Ethernet chips (components not necessarily on an E7505 motherboard) have Windows Server 2003 drivers, as does my graphics card (nVidia Quadro FX500).

Clearly, then, my processor resources aren't idle all the time that the machine isn't in foreground use - there's probably more like 1.5 or 1.6 processors spare (plus Hyperthreading on those). With about 57 hours of uptime since reboot, the various other tasks the machine has run have taken around 300 minutes of CPU time - whatever that is in real execution time bearing in mind Hyperthreading is on.
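
As a rough sanity check on those two figures, here is the arithmetic as a small Python sketch (the function name is made up for illustration). It treats the 300 minutes as if they had been spread evenly over the 57 hours, which they certainly weren't, so it only gives an average.

```python
def average_background_load(cpu_minutes, uptime_hours):
    """Return background CPU time as a fraction of one processor's wall-clock time."""
    if uptime_hours <= 0:
        raise ValueError("uptime must be positive")
    return cpu_minutes / (uptime_hours * 60.0)


if __name__ == "__main__":
    load = average_background_load(cpu_minutes=300, uptime_hours=57)
    print(f"average background load: {load:.1%} of one processor")
```

That comes out at a little under 9% of one processor on average. The average hides the bursts, though - a backup or a mail run can keep a processor busy for a while - so it says little about how much is spare at any given moment.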


I do wonder whether Hyperthreading is part of the reason for this result. Two processes executing on one processor using Hyperthreading will not complete in the same time as one, barring the very occasional case. The gain from Hyperthreading can, if my understanding is correct, often be no better than a few percent. That said, my understanding is that the task scheduler in Windows XP (and Windows Server 2003 - but not Windows 2000) is intelligent enough to use spare physical processors before logical processors.


I have to say that this result is in line with what I predicted - though apparently Thyme Lawn's results are different (maybe his machines are genuinely idle when they're not in use, with no background tasks).


Whatever other people's results, it looks like I'm not going to be able to have the best of both worlds - if I run any BOINC, then my throughput on classic CPDN will drop unless a version of classic CPDN is made available that has "Below Normal" priority as an option instead of "Low".

I guess, too, at some point if I want maximum BOINC throughput, I'll have to cut away from legacy CPDN to all BOINC - at which point I may as well run four BOINC instances to use every last shred of spare processor resources in the machine (my disk resources are fast - the main disks in the machine are a pair of Seagate Cheetah 15K.3 SCSI drives, among the fastest HDs you can get).


At the moment, I think I have the worst of both worlds - degraded legacy CPDN performance and sub-optimal BOINC throughput!


I do wonder whether support for explicit processor affinity in BOINC (and, for that matter, in legacy CPDN) would help - if it's possible to add such support to a program. If you don't set processor affinity, then, at least in Windows XP, my impression is that the process can 'spin' between processors. On a single processor Hyperthreading machine, there's no performance hit spinning between the two virtual processors; they share one cache. On a machine with more than one physical processor, there is a performance hit, as each processor can only access its own cache.

There are people here who are far more experienced with BOINC and probably more experienced with IA-32 architecture than I am - I raise this simply for what it's worth based on my less than complete understanding of both subjects.


Meanwhile, I'd appreciate advice on what I should do - I suppose the optimal way ahead is to mark my legacy CPDN account to finish at the end of this run, ratchet up to 3 BOINC CPDNs, accept that my place in the legacy CPDN machine league table will fall consistently from this point (sob! - I've already dropped one place), but that my place in the BOINC machine league table will be much better.

Then, when my legacy CPDN run finishes, I should switch to 4 BOINC CPDNs.


I don't suppose there's any plan to link the legacy CPDN stats with the BOINC ones, is there?


Any thoughts are welcome,




David
6) Message boards : Number crunching : BOINC versus legacy CPDN (Message 210)
Posted 6 Aug 2004 by old_user271
Post:
> A hyper-threading processor counts as 2 processors. If you set your
> preferences to use a maximum of 3 processors you should be able to run 3 BOINC
> clients and your legacy CPDN at the same time.

True - though remember that an HT processor only has one set of execution resources. Add to that the lack of use of explicit processor affinity (apparently) by BOINC and by legacy CPDN, and I fear any more than one BOINC model executing has the potential to hurt legacy CPDN throughput.
7) Message boards : Number crunching : BOINC versus legacy CPDN (Message 199)
Posted 6 Aug 2004 by old_user271
Post:
I've installed the beta BOINC client this afternoon and am running BOINC on one CPU as a Windows service. I've also left my legacy CPDN setup running.

The hardware here is a dual Xeon 2.66GHz machine (Windows XP Professional SP1) with Hyperthreading on - so I have the potential to run four CPDNs at once using BOINC.


I care somewhat about my machine ranking - and appreciate that BOINC statistics are as yet uncertain. I'm about 35% into Phase 1 of legacy CPDN at the moment.

For now, I think I'll leave things set as they are (if I increase BOINC to multiple CPUs, legacy CPDN throughput will likely drop as I only have two actual CPUs).

Obviously if I decide to discontinue legacy CPDN, I'll uncheck the option somewhere in my account so that my CPDN client finishes at the end of the run.


I guess what I'm after is some ideas as to how the BOINC migration will be made in the end - for which I guess the answer, right now, is "we haven't decided". Presumably, though, there'll come a point after the BOINC public launch where legacy CPDN will be turned off - once machines finish a run, they won't be offered another.
8) Questions and Answers : Windows : Boinc Client as Windows Service (Message 197)
Posted 6 Aug 2004 by old_user271
Post:
So far as I can tell, the Windows Service client is already compiled. Following the information in the BOINC web pages at http://boinc.berkeley.edu/service.php, run boinc_gui -install (probably best if you include the full path to boinc_gui in the command line!)

On Windows XP, I found I had to grant "Network Service" permissions (I chose Full Control) on the BOINC folder and all subfolders before it worked properly. This runs counter to the web page - maybe it's a BOINC 4 change.


This seems (to me anyway) a better way of running things - though I'm unclear what dependency issues this gives me with the GUI application, and what happens with the GUI application with users logged off (presumably it can't run).



David (who is completely new to BOINC and may be talking rubbish)



