Posts by Aaron Doucett

1) Message boards : Number crunching : Can't upload for 12 days (Message 41579)
Posted 1 Feb 2011 by Profile Aaron Doucett
Post:
Good news, everyone! For the first time in many days, my famous models are uploading! Thanks everyone at CPDN who helped to fix the problem.

Now I wish I hadn't cancelled several finished famous models that were "clogging the tubes" so to speak.

Let's just hope the server doesn't get hit too hard by the surge of people now trying to upload models...
2) Message boards : Number crunching : Can't upload for 12 days (Message 41563)
Posted 30 Jan 2011 by Profile Aaron Doucett
Post:
I understand this has been discussed before, but I believe I am having a similar issue. Unfortunately, because the FAMOUS models that have completed will not upload, the finished hadam3p models and trickles are getting stuck in the queue and can't upload either. (If there are only hadam3p models present, they upload fine, as the FAMOUS ones are not stuck in front of them.)

This has led to having a major backlog of models on several of my machines. I am hoping the models will be able to upload normally soon, but I am considering aborting the finished models if much more time goes by without them going through.


This is a screenshot showing my upload Transfer page in the BOINC manager



Current tasks:



Any ideas on if/when uploading functionality will be restored, or any possible solutions on my end?

Note: this is not a localized problem on one machine. Every single PC I've checked that I have running models is not able to upload FAMOUS at this time.
3) Message boards : Number crunching : No credits in last 3 days??? (Message 41283)
Posted 13 Dec 2010 by Profile Aaron Doucett
Post:
If you look at the graph above it seems there is a large spike as of today... perhaps the credits finally got counted for the past couple days? I noticed a drop in the RAC from the past couple days but I think now that it's showing up again it will compensate for that. Also, I saw uploads on some of my machines that were stalled and just started working again today. A big thanks to all the server admins who (try to) keep things running smoothly!

Also, a quick question: do sites like BoincStats pull the stats from CPDN at different intervals than CPDN posts its own? I see variations in the data occasionally.
4) Message boards : Number crunching : No credits in last 3 days??? (Message 41269)
Posted 12 Dec 2010 by Profile Aaron Doucett
Post:
Looking at the BOINC Stats page, I see there is a bit of an issue for all users with credit uploads.

Here are the graphs I noticed:







It seems the progress occasionally stalls but then picks up with one day of "exaggerated" credit, making up for the days that were reporting none.

I'll keep an eye on my uploads to make sure they don't get stuck in that phase, but hopefully everything will be running smoothly again!
5) Message boards : Number crunching : Performance on hadcm3igeo "coupled" models (Message 41234)
Posted 5 Dec 2010 by Profile Aaron Doucett
Post:
RAC helps the new guys who haven't been on the project as long but helps rank the power of your "cluster". I only started CPDN in October but look at the top participants list! :D

Also, I helped my team climb into a top 10 slot worldwide.

I have no problem running models that take several days but if it's at the detriment of our RAC it's not so fun!
6) Message boards : Number crunching : Performance on hadcm3igeo "coupled" models (Message 41195)
Posted 1 Dec 2010 by Profile Aaron Doucett
Post:
Hi, I've recently seen some of my computers running this new type of model. What is alarming to me is that while the FAMOUS models were generally taking around 300 hours on (almost exclusively) Intel dual core machines, I am now seeing To Completion times of over 1000 hours without being in an "ice-world" state. The seconds per time step are over 2.0 in some cases, and while I can tell these models are more complex, I am wondering if this is normal. I would rather not have to see a major drop in RAC if there is some kind of incompatibility with this model type.

I appreciate any info!!
7) Questions and Answers : Windows : HADSM3 Model "stuck" (Message 40998)
Posted 9 Nov 2010 by Profile Aaron Doucett
Post:
There are definitely problems with that machine... Today BOINC detached from the project all on its own... and again downloaded 4 new models. Something is not right! (This is after a complete reinstall as well.)
8) Questions and Answers : Windows : HADSM3 Model "stuck" (Message 40991)
Posted 8 Nov 2010 by Profile Aaron Doucett
Post:
I've noticed some strange behavior from the computer http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1104234

which I think is related to stability issues. I've cleaned it up a bit and am HOPING that I get more consistent performance, as I was getting before. It's a shame to have the average credit for one of your fastest PCs halved because of system lockups!

As for the folder named C:\ , that is the actual name of the folder on the machine. A typo, I am thinking.

Even though there were models in progress, that same machine seems to have already downloaded all new tasks. (4 FAMOUS models... interesting)

Hope this clears a couple of things up at least.
9) Questions and Answers : Windows : HADSM3 Model "stuck" (Message 40978)
Posted 5 Nov 2010 by Profile Aaron Doucett
Post:
I believe I have some information that might help you.

If you look at my error while computing list,

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?userid=100701&offset=0&show_names=0&state=5

You will find that computer 1104234 has had several "errors while computing" in the past couple of days. This is a little odd, as I haven't seen this volume of errors in the past. I also discovered that one of my "COSMOS" machines (an array of PCs I have set up dedicated to running CPDN 24/7) had been running a hadsm for 700+ hours! So I aborted that task and noticed a significant loss of average credit on that machine (computer 1105294). The task ID for that "iceworld" was Task 11014309.

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=11014309

This however (below),

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1104234

is the link to the machine that I also saw issues with. Nothing else in the system has changed in the past couple weeks so I'm not sure what would lead to the errors all of a sudden... Hope this helps you help me!
10) Questions and Answers : Windows : HADSM3 Model "stuck" (Message 40974)
Posted 5 Nov 2010 by Profile Aaron Doucett
Post:
Thanks for the input! Now that you speak of it, I did see a completely blue earth under temperature mode! I assumed it just wasn't computing anything...but I guess instead I had encountered one of these "ice worlds" .... how horrifying! :D

In either case, after a night of running I now am seeing seemingly completely different models running... very strange.

11) Questions and Answers : Windows : HADSM3 Model "stuck" (Message 40972)
Posted 5 Nov 2010 by Profile Aaron Doucett
Post:
Hi, I've been using BOINC for about a month now and have had a lot of success, but this is the first time I've run into this issue:

On a machine with Windows XP, I've been running a model
"hadsm3dhet2_jkrd_006589899_4"

For the past few hours or more, the model has been stuck at 95.324% done, and while the elapsed time keeps counting up, the To Completion time does not follow. The time steps have changed, but at a rate of about one every 6 seconds... rather than my usual 0.4 s per TS. The total elapsed time is now at 305 hours.
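To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: it treats the 305 elapsed hours as if they had all been spent at the normal rate, and it assumes the remaining work scales linearly with the percentage left. The function names are made up for illustration and are not part of BOINC or CPDN.

```python
def slowdown_factor(normal_s_per_ts, current_s_per_ts):
    """How many times slower the current time steps are than usual."""
    return current_s_per_ts / normal_s_per_ts


def remaining_hours(elapsed_hours, fraction_done, slowdown=1.0):
    """Estimate hours left, assuming work scales linearly with progress.

    Assumption: all elapsed time was spent at the normal rate, which
    slightly overstates the normal per-percent cost if the model has
    already been slow for a while.
    """
    hours_for_whole_model = elapsed_hours / fraction_done
    return hours_for_whole_model * (1.0 - fraction_done) * slowdown


# Figures quoted in the post: 0.4 s/TS normally, about 6 s/TS now,
# 95.324% done after 305 hours.
factor = slowdown_factor(0.4, 6.0)
normal = remaining_hours(305.0, 0.95324)
slowed = remaining_hours(305.0, 0.95324, factor)

print(f"slowdown: about {factor:.0f}x")
print(f"remaining at normal rate: about {normal:.0f} h")
print(f"remaining at current rate: about {slowed:.0f} h")
```

Under these assumptions the last few percent would take roughly 15 hours at the usual rate, but on the order of 220 hours if the slow time steps persist.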

There are 4 models running simultaneously in total on this machine (quad core Intel) and this one doesn't seem to want to finish, despite still using a 25% chunk of the CPU.

I've come this far... It would be a shame to have to abort! What should I do? Is there any hope?



