Posts by Dave Peachey

1) Message boards : Number crunching : Iceworld Appeal (Message 38610)
Posted 1 Jan 2010 by Dave Peachey
Post:
Iain,

I've another iceworld (hadsm3mh_kunl_006488661) for which I've managed to capture the key files at the second attempt. As with a number of other models, it seems to go blue at a point just off the western US coast.

I've subsequently aborted the model but have a backup from an hour or so before the blueness so could re-run it if required. I've also emailed a zip file to your previously advised email address.

Cheers
Dave
2) Message boards : Number crunching : Iceworld Appeal (Message 38338)
Posted 21 Nov 2009 by Dave Peachey
Post:
Iain,

Update:

Hah, caught it at timestep 11577 - I even had the graphics turned on just at the point it tripped over so was able to watch it go completely blue over a couple of timesteps.

Just to confirm it, I re-ran the last few timesteps (was able to switch off the model before it did a checkpoint) and it froze at the same t/s three times straight. Seemed to spread from the US west coast as per others mentioned above.

Sorry, this is all a bit sad but it's the first time in years I've caught one blue-handed (so to speak)!

Ready if/when you are
Dave
3) Message boards : Number crunching : Iceworld Appeal (Message 38335)
Posted 20 Nov 2009 by Dave Peachey
Post:
Iain,

I've likely got another iceworld for you - hadsm3fub_kbz7_006472701 went blue somewhen before 35.5% complete so I've wound it back a ways (currently at just beyond 34%) and I'm re-running with recording switched on.

It will probably be a day or so until it hits the blue wall again (I didn't catch the exact point first time around) but a note of the email address to which to send the '.cpdn' file would facilitate a speedy upload of the appropriate file.

Cheers
Dave
4) Message boards : Number crunching : Unable to upload 2 zip files (Message 37102)
Posted 8 Jun 2009 by Dave Peachey
Post:


Yes, that's a possibility. It worked last time, back in 2005.



Les,

Not sure if you're referring to what I suspect (might be restating the obvious, in which case, apologies) - some while back (probably was 2005) in the early days of the project there was a process for manually uploading the zip files so the project admins could ensure no results were lost. They would then manually set the individual's stats to show "task completed" whilst ensuring full credit was given for the work done.

Having said that, file sizes for some of the more recent applications seem to be significantly larger and, with more people crunching more for the project, I wonder whether that would still be an option.

For example, I have four runs actually or nearly finished and, of the two for which I'd saved out the eight resulting zip files (long memory suggested there might be a problem before I read this thread), the combined file size was pushing 50MB (and will double for the remaining two applications nearly/actually completed today).

Not sure what I'm trying to impart (other than sharing an old experience) but I do wonder whether an interim, manual upload process would be practical (assuming, of course, there's somewhere to store the results)?

Cheers
Dave
5) Message boards : Number crunching : New Problem (Message 5547)
Posted 21 Oct 2004 by Dave Peachey
Post:
Guys,

I'm getting similar problems - it started with BOINC v4.05 and now continues with v4.09 (I haven't found, and don't intend to install, v4.13 until it's on the CPDN download page). The machine is an un-overclocked dual Athlon MP2800 system running Windows XP and, in essence, I've had two slightly different things happen - each with equally disastrous results:

1) Firstly with v4.05 BOINC client - last weekend, having got back from several weeks' holiday, I found that one of the two WUs had completed (the other was 99+% complete) but hadn't uploaded because my i/net connection had switched off in my absence. I reset the i/net connection, did a manual update, got the system to trickle and download two new WUs to replace the one complete and one nearly complete WU. However, as the completed WU didn't seem to want to upload, I suspended and then closed the BOINC application, rebooted the machine and restarted BOINC - only for the system to lose all references to my attachment to the CPDN project and clear out all WUs (completed, partially completed and both new ones)!

So, I uninstalled BOINC v4.05, cleared out the CPDN directory and installed BOINC v4.09, reattached to CPDN and then downloaded two new WUs. So far so good (hah bloody hah!), you might think ...

2) This evening, I did a couple of o/s critical updates and had to reboot the machine so, as per usual, I suspended and then closed the client before rebooting. However, on restarting, the BOINC GUI executable had gone missing (as in, has been deleted!). Having no other obvious course of action, I reinstalled the v4.09 BOINC application but, for some unknown reason, the system chose to malfunction and upload the existing partially completed WUs with client error messages.

Yes, I did manage to download another pair of WUs but these two malfunctions in less than a week have cost me six WUs - two complete or as near as damnit (forty or more days processing lost forever), two unstarted WUs and two partially completed WUs uploaded with client errors.

To say that I'm pissed off would be an understatement! I _presume_ due to a lack of any previous troubles with the machine when running "classic" CPDN that the BOINC client is the problem area. I can't say this is leaving me overly impressed!!

This puts my sorry statistics for my three machines since I started CPDN on BOINC on 5th August at:
23 = downloaded WUs
2 = uploaded results (completed successfully)
8 = aborted (uploaded with "client error" results)
5 = lost forever
8 = still processing
Not a good record so far - for me or for BOINC!
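
As a quick sanity check on the tally above, here is a minimal Python sketch. The variable names are illustrative only and are not part of BOINC or CPDN; the figures are the ones quoted in the list. It confirms that the four outcomes add up to the 23 downloaded WUs and works out what share of the WUs that are no longer processing completed successfully.

```python
# Figures quoted in the post; the names here are illustrative only.
downloaded = 23
outcomes = {
    "uploaded (completed successfully)": 2,
    "aborted (client error)": 8,
    "lost forever": 5,
    "still processing": 8,
}

# The four outcomes should account for every downloaded WU.
accounted_for = sum(outcomes.values())
assert accounted_for == downloaded

# Of the WUs that are no longer running, how many completed successfully?
no_longer_running = downloaded - outcomes["still processing"]
success_share = outcomes["uploaded (completed successfully)"] / no_longer_running

print(f"{no_longer_running} WUs finished, failed or were lost; "
      f"{success_share:.1%} of those completed successfully")
```

On those numbers, 2 of the 15 WUs that are no longer running were successful, which is about 13%.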

Dave
6) Questions and Answers : Windows : Preferences not applying correctly (Message 1545)
Posted 23 Aug 2004 by Dave Peachey
Post:
Shelby,

If you're still having a problem with this, have a look at a thread I created on <a href="//climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=259">a similar topic here</a>.

Whilst my own resolution may not help you, Carl's final comments re bugs in the BOINC scheduler software may, at least, help to explain why you've had the problem.

Cheers
Dave
7) Questions and Answers : Windows : Win2K Pro - only downloading one WU on dual CPU machine (Message 1509)
Posted 22 Aug 2004 by Dave Peachey
Post:
Carl,

Dang, you caught me between message edits - it looks as though today's trashed WU has uploaded an error result after all so this one won't hang around pretending to be unfinished!

However, I'm still confused about the (seemingly apparent) difference in the response of the scheduler I have encountered just by creating a set of "Home" preferences.

Cheers
Dave
8) Questions and Answers : Windows : Win2K Pro - only downloading one WU on dual CPU machine (Message 1506)
Posted 22 Aug 2004 by Dave Peachey
Post:
OK, try this one ...

I decided to have a dicker about with Carl's optimised client on my recently installed dual Athlon MP2800 computer - this initially caused a client failure and, because I forgot to do a backup before updating it, in the process took out another partially-completed WU (although it does, at least, seem to have registered this one as an incomplete/error client run so this one's not going to hang about). Needless to say, I wasn't happy, but put it down to my own stupidity, downloaded another WU and started the system working again - although with the original executable.

Note that, at this point, I hadn't made any changes to my preferences but have (on a periodic basis throughout yesterday and today) been running the "update" process to try to download another WU for that machine. Then, for some reason, having bollixed things up (just for a change!) I decided to go to my personal preferences and, for something to do, created a supplementary set of "Home" preferences which mirrored my "Default" preferences.

So the situation is ... having initially got this machine restarted with just the one new WU and _then_ having created the additional "Home" preferences, when I stopped/restarted the client, for reasons I can't explain, it now recognised the second CPU, gave me an "imminent scheduler starvation" message and downloaded a second WU for that machine.

My question is, therefore, why would it (apparently) not play ball in respect of the two processors on my dual Athlon machine with the "Default" preferences only (although it was quite happy to do so with my dual Xeon machine two weeks ago) but, once I'd created some supplementary, albeit exactly equivalent, "Home" preferences, it realised it needed more work and did something about it?

Cheers (from a marginally more confused than usual)
Dave

PS: Can we also have some way of doing HTML tagging in this forum as it's a pain not to be able to emphasise text in bold and/or italics?
9) Questions and Answers : Windows : Win2K Pro - only downloading one WU on dual CPU machine (Message 1421)
Posted 21 Aug 2004 by Dave Peachey
Post:
Carl/Thyme,

Thanks for the comments/thoughts.

My resource sharing for all my machines is set to 100% CPDN so I don't believe that this is the source of the problem.

I've also considered the available/allocated disk space issue and, given that the available partition space (for the CPDN/other DC client partition) on both the dual Xeon and dual Athlon machines is currently set to 3GB, it would be somewhat of a surprise to find that the dual Athlon is suffering as a result of insufficient available disk space - especially given that the dual Xeon currently has less than 2GB of available space for three BOINC CPDN clients plus an instance of the non-BOINC client (actually, that's where the 1.8GB of available disk space comes in) whereas the dual Athlon has almost 2.5GB of available disk space for one instance of each client!

My inclination, therefore, is also to discount this as the root cause of the problem - based, also, on the lack of any of the "insufficient disk space" messages I remember getting when I set up the preferences when starting the dual Xeon machine - as I've had no equivalent error log messages to this effect for the dual Athlon machine.

Thyme's added comment re the ability to invalidate/release WUs arising from detachment from the project (especially in the early days of the public BOINC launch from the end of next week) is a good point and reinforces my request that some means of user/admin-driven manual release of known invalid/defunct WUs be implemented.

Any more ideas?

Cheers
Dave
10) Questions and Answers : Windows : Win2K Pro - only downloading one WU on dual CPU machine (Message 1386)
Posted 21 Aug 2004 by Dave Peachey
Post:
I've been running CPDN very happily on a dual P4 Xeon machine running WinXP Pro (SP2) which has four virtual CPUs, three of which are in use for CPDN (as I wanted!) and have now migrated a second box across - this one is a dual Athlon MP2800-based machine running Win2K Pro (SP4).

Having installed the software on this box without a problem and noted in the computer's profile that it is recognised as having two CPUs, I am concerned that the scheduler is only downloading a single WU for this machine. I am getting no obvious error messages and my preferences are set up in such a way that - as far as I can tell - I really shouldn't have this problem (max CPUs=4; max disk space=5GB; no. days work held=40).

I have tried (repeatedly) to update the system and, although the scheduler responds very nicely - I get no error messages, no indications of lack of disk space or imminent scheduler starvation - it gives me no more work. In desperation, I have also detached from, and reattached to, the CPDN project. The only effect of this was to trash the original workload (BTW, how can you remove a WU from your account which you're never going to complete? In case someone feels like resolving this, it's WU ID 8398) and download another single WU.

Fortunately (?!) I have a spare CPDN "classic" WU which I had set aside to get my dual Athlon machine running under BOINC (it was to be completed later on the dual Xeon box when its current CPDN "classic" WU completed) so I've reinstalled that in order to use up the CPU cycles on the second processor. I would, however, prefer to be running the BOINC client on both CPUs of this new machine.

Is this a bug with Win2K Pro, a (not so obvious) scheduler problem or something else? Any thoughts?

Cheers
Dave
11) Message boards : Number crunching : BOINC versus legacy CPDN (Message 482)
Posted 9 Aug 2004 by Dave Peachey
Post:
The following may be of interest to this thread ...

I have, for some time, been running CPDN on three dual processor machines - two are Tyan Tiger MPXs running MP2800s and the third is a Supermicro board running P4 Xeons at 3.2 GHz. In order to maximise instances of CPDN, I have, up to now, run two instances of CPDN classic per machine - one native and one inside a VMWare shell running Windows 2000.

I have now maximised the use of my P4 Xeon by adding CPDN BOINC beta instances on the other two "processors" not previously used. This machine is giving me the following s/ts for the various instances of CPDN (as measured by CPFarmView for the instances of CPDN classic):
- CPDN classic (native) 5.26 s/ts
- CPDN classic (VMWare) 5.45 s/ts
- CPDN BOINC 1 3.82 s/ts
- CPDN BOINC 2 3.78 s/ts

It would appear that, in some fashion, the BOINC CPDN client is more efficient than the CPDN classic!
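
To put a rough number on that, here is a minimal Python sketch that takes the four s/ts figures quoted above and works out how much less time per timestep each instance needs relative to the native classic client. The dictionary and function names are illustrative only, and the comparison is not a controlled benchmark: it takes no account of how the four instances share the machine's processors.

```python
# Seconds per timestep (s/ts) as reported in the post; lower is faster.
seconds_per_timestep = {
    "CPDN classic (native)": 5.26,
    "CPDN classic (VMWare)": 5.45,
    "CPDN BOINC 1": 3.82,
    "CPDN BOINC 2": 3.78,
}


def time_saving(baseline: float, other: float) -> float:
    """Fractional reduction in time per timestep relative to the baseline.

    Positive means faster than the baseline, negative means slower.
    """
    return (baseline - other) / baseline


baseline = seconds_per_timestep["CPDN classic (native)"]
savings = {
    name: time_saving(baseline, value)
    for name, value in seconds_per_timestep.items()
}

for name, saving in savings.items():
    print(f"{name}: {saving:+.1%} time per timestep vs classic (native)")
```

On these figures the two BOINC instances need roughly 27 to 28 percent less time per timestep than the native classic client, while the VMWare instance is a few percent slower.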

Unless someone has a better reason?

Cheers
Dave



