Posts by DadX

1) Message boards : Number crunching : East Asia testing. (Message 68937)
Posted 23 Jun 2023 by DadX
Post:
Another 3 East Asia tasks bit the dust after 2 minutes or less.
I'm running Win 11 on an i5 (6 cores) with 8GB of memory, of which more than 4GB is available.

Regards,
DadX
2) Message boards : Number crunching : East Asia testing. (Message 68930)
Posted 23 Jun 2023 by DadX
Post:
Correction
I had two tasks that failed after 2 minutes: wah2_eas25_a3ny_201511_25_994_012220188 and wah2_eas25_a1cm_199611_25_994_012217188.
Regards,
DadX
3) Message boards : Number crunching : East Asia testing. (Message 68929)
Posted 23 Jun 2023 by DadX
Post:
I had 2 wah2_eas25_a3ny_201511_25_994_012220188 tasks and both crashed after 2 minutes. It's having trouble uploading them.

Regards,
DadX
4) Message boards : climateprediction.net Science : WCG Climate project (Message 61482)
Posted 7 Nov 2019 by DadX
Post:
From my understanding there are a few things creating the scarcity of WU.
1. The normal, somewhat limited number of WU common with a project start.
2. To some degree the new WU are created based on the information returned by prior WU.
3. The results are huge and it takes time and space to process them, limiting the throughput.
4. WCG has a large pool of crunchers so new WU get snapped up quickly.

I've gotten 10 WU ( 6 completed and returned ) across 2 machines, allocating only 3 CPUs (total) to the project. To do that I've had to set something up to poll for work every 10 minutes. I get enough work to keep the 3 CPUs busy with the project most of the time. Someday I'll make the time to write a script (Windows OS) to only poll when I need the work.
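
A rough idea of what that "only poll when I need the work" script could look like, as a minimal Python sketch rather than anything tested against a live client. It assumes `boinccmd` is on the PATH, that `boinccmd --get_tasks` prints one `project URL:` line per task, and that the project URL passed in is the one the client is attached to (all of these are assumptions; the helper names and the threshold of 3 are made up for illustration).

```python
import subprocess


def count_project_tasks(get_tasks_output: str, project_url: str) -> int:
    """Count tasks in `boinccmd --get_tasks` output that belong to project_url.

    Assumes each task is listed with a line of the form 'project URL: <url>'.
    """
    wanted = project_url.rstrip("/")
    count = 0
    for line in get_tasks_output.splitlines():
        key, _, value = line.strip().partition(": ")
        if key == "project URL" and value.strip().rstrip("/") == wanted:
            count += 1
    return count


def poll_if_needed(project_url: str, wanted: int = 3, boinccmd: str = "boinccmd") -> bool:
    """Ask the client to contact the project only when fewer than `wanted` tasks are on hand.

    Returns True if an update request was sent, False if there was already enough work.
    """
    listing = subprocess.run(
        [boinccmd, "--get_tasks"], capture_output=True, text=True, check=True
    ).stdout
    if count_project_tasks(listing, project_url) >= wanted:
        return False
    # Same request as pressing "Update" for the project in the BOINC Manager.
    subprocess.run([boinccmd, "--project", project_url, "update"], check=True)
    return True
```

Something like Windows Task Scheduler could still run this every 10 minutes; the difference is that the project only gets contacted when the count has dropped below the number of CPUs allocated to it.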
5) Message boards : Number crunching : Abnormally long-running models (Message 57892)
Posted 5 Mar 2018 by DadX
Post:
CPDN tasks have exhibited a bunch of strange behaviors over the years and there are a lot of tricks to goose the tasks when they appear to hang. It is usually best to give the task a couple of multiples of a trickle time to see if there was just a long tail (unlikely but possible). Other things I have tried, with varying degrees of success, are below. NOTE: MAKE SURE THE "Leave non-GPU tasks in memory while suspended" BOX IS CHECKED before you try these.

1. Suspend the task for a few minutes then allow it to run. It might complete, it might lose a few % points and have to run to completion, or it might have no effect.

2. Cleanly exit the application, allowing it to shut down the tasks as it closes. If you have installed BOINC as a service you'll probably have to reboot to get the services to restart, otherwise just restart the app (BOINC) and see what happens. Potential results are the same as above with the added possibility that it will exit with a fatal error, which will leave you in the same place as an abort.

In both cases give it an hour or two, to see if it made a difference.
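
For anyone who would rather script the "wait a couple of trickle times, then suspend and resume" routine, here is a hedged Python sketch. The trickle interval is something you have to supply from your own task history, the helper names are made up, and the project URL and task name in the usage note are placeholders. The only client call relied on is the `boinccmd --task <url> <task name> suspend|resume` form.

```python
from datetime import datetime, timedelta


def looks_hung(last_trickle: datetime, now: datetime,
               trickle_interval: timedelta, multiples: float = 2.0) -> bool:
    """True once more than `multiples` typical trickle intervals have passed without a trickle."""
    return now - last_trickle > trickle_interval * multiples


def task_command(project_url: str, task_name: str, op: str,
                 boinccmd: str = "boinccmd") -> list:
    """Build the boinccmd argument list that suspends or resumes one task."""
    if op not in ("suspend", "resume"):
        raise ValueError("op must be 'suspend' or 'resume'")
    return [boinccmd, "--task", project_url, task_name, op]
```

Usage would be along the lines of checking `looks_hung(...)`, running `subprocess.run(task_command(url, name, "suspend"), check=True)`, waiting a few minutes, then doing the same with `"resume"`. Whether that actually unsticks the model is as hit-and-miss as doing it by hand.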
6) Message boards : Number crunching : Credit Status (Message 57265)
Posted 30 Oct 2017 by DadX
Post:
Love the pts/hr since the credits were restored. But if it's not intentional then there's probably something that still needs looking into. My most recently completed task and the one in progress both got a hefty lift in points/hour, somewhere between 2 and 3 times better than what I've received on the same machine for prior work. Don't get me wrong, I love it. It brings the credits awarded (per hour) for this project from one of the worst to one of the best.

And just in case someone wants to look at it, the WUs are:
wah2_pnw25_rlr2_200412_24_666_011289809 (Completed)
wah2_pnw25_rib1_200212_24_666_011285344 (In progress)
7) Message boards : Number crunching : New work Discussion (Message 55484)
Posted 13 Jan 2017 by DadX
Post:
To solve the prioritization problem with CPDN and other projects I used to run a Linux VM on Windows using VirtualBox. I have a 4-core value machine so I gave Linux one core, installed BOINC and used it to run CPDN. This in effect limited CPDN to one core and other projects running in Windows to 3. It did increase run times somewhat but I got the segregation I wanted and left reasonable performance for non-BOINC work. With models running under Linux being phased out of CPDN I no longer bother with this. You could run a Windows VM on a Windows host but that requires two Windows licenses, which I don't wish to pay for.
8) Message boards : Number crunching : Credit (Message 54927)
Posted 12 Oct 2016 by DadX
Post:
The last trickle for my last WU (wah2_eu25_9d3f_208912_13_403_010567433_1) was posted on 9/26. I seem to have all the credit for it awarded in CPDN but the last time I see credit in BOINCSTATS for CPDN was 9/24. Is this a BOINCSTATS issue or is the problem someplace else?

9) Message boards : Number crunching : Project communication failed (Message 54865)
Posted 29 Sep 2016 by DadX
Post:
I was wondering about those "Suspended CPDN Monitor - Suspend request from BOINC..." messages. Even with successful completions I get pages and pages of them, and I do have LAIM checked. Why are they happening?

10) Message boards : Number crunching : Windows - Not checkpointing (Message 52438)
Posted 18 Aug 2015 by DadX
Post:
I have one UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 that has not sent a trickle or written a checkpoint after close to 100 hours of elapsed time. It's not making much progress either. Should I cancel this task or just let it run?

Thanks
DadX
11) Message boards : Number crunching : Does the Tasks tab still chew CPU, and slow down computation? (Message 50703)
Posted 1 Nov 2014 by DadX
Post:
I have been running these climate WU on a 1GB (with ample virtual memory) desktop, formerly Windows XP and now Lubuntu, for many years and they complete. Most of the time I run them concurrently with WCG. Like many others, my WU result list is littered with download errors and incompletes (short models, PNW). When running this project on Windows I do set my virus scan to exclude the BOINC data directories and I pause the calculations when a full disk virus scan is scheduled to run. I am also careful to avoid shutdowns near the 25, 50 and 75 percent complete points. Of course it can take weeks or months to complete the long models, but it is a 10-year-old "value" class machine.
12) Message boards : Number crunching : Credit updates? (Message 49523)
Posted 9 Jul 2014 by DadX
Post:
Alas, no credit updates again. Trickles but no credit update on the CPDN site or BOINCStats.

Was that most recent credit update a one-time manual run or is it supposed to run regularly again?

Thanks
DadX
13) Message boards : Number crunching : Trickles not updating credits (Message 49313)
Posted 7 Jun 2014 by DadX
Post:
It seems that the trickle and completed credits for Workunit 8896123 stopped updating at some point in the last few days.
14) Message boards : Number crunching : UK Met Office HADAM3P (global only) with MOSES II landsurface scheme v7.03 (Message 49100)
Posted 14 May 2014 by DadX
Post:
Will the results of these tasks have any value, or should we just abort them, as that appears to help flush them from the system?
15) Message boards : Number crunching : UK Met Office HADAM3P (global only) with MOSES II landsurface scheme v7.03 (Message 49013)
Posted 1 May 2014 by DadX
Post:
How about running them in a VM and saving the machine state (VirtualBox) when a reboot is required? I can set this up this weekend if nobody has tried it yet.
16) Message boards : Number crunching : Credit updates? (Message 48472)
Posted 20 Mar 2014 by DadX
Post:
I've sent 5 trickles for workunit 16369306 and 1 for workunit 8714953, but my claimed and granted credits are both null. Perhaps there is an accumulator routine asleep?
17) Message boards : Number crunching : Task ... exited with zero status but no 'finished' file (Message 48124)
Posted 8 Feb 2014 by DadX
Post:
On my only PNW WU from the most recent batch I got a boatload of these messages:
2/7/2014 9:35:43 PM | climateprediction.net | Starting task hadam3p_pnw_ucto_2007_1_008509672_1 using hadam3p_pnw version 722 in slot 2
2/7/2014 9:35:57 PM | climateprediction.net | Task hadam3p_pnw_ucto_2007_1_008509672_1 exited with zero status but no 'finished' file
2/7/2014 9:35:57 PM | climateprediction.net | If this happens repeatedly you may need to reset the project.


Then 13 messages (one for each zip) like:
2/7/2014 10:00:21 PM | climateprediction.net | Output file hadam3p_pnw_ucto_2007_1_008509672_1_1.zip for task hadam3p_pnw_ucto_2007_1_008509672_1 absent

Then the WU rolled over and died. Any guess as to why?

Windows 7 64bit
12GB memory with 7GB available.
AMD A6 running 2 WCG tasks and a Linux session in VirtualBox
CPU utilization at 75% on average before the PNW task
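
If anyone wants to count these up from a saved copy of the event log instead of scrolling through it, here is a small Python sketch. It assumes the log lines look exactly like the ones pasted above; the function name and the shape of the summary are just illustrative choices.

```python
import re
from collections import Counter

# Patterns taken from the message text quoted above.
NO_FINISHED = re.compile(r"\| Task (\S+) exited with zero status but no 'finished' file")
ABSENT = re.compile(r"\| Output file (\S+) for task (\S+) absent")


def summarize(log_text: str):
    """Return (restart counts per task, missing output files per task)."""
    restarts = Counter()
    missing = {}
    for line in log_text.splitlines():
        match = NO_FINISHED.search(line)
        if match:
            restarts[match.group(1)] += 1
            continue
        match = ABSENT.search(line)
        if match:
            missing.setdefault(match.group(2), []).append(match.group(1))
    return restarts, missing
```

Feeding it the text of the log gives the number of "no 'finished' file" restarts per task and the list of zips reported absent, which is handy when posting a report like this one.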
18) Message boards : Number crunching : VANISHING WU'S (Message 47993)
Posted 16 Jan 2014 by DadX
Post:
Thanks mo.v
I had in mind the latter, not the former, but didn't think the process through.
On my "value" machines and with my workday it would be rare indeed if there was more than one or two WUs to abort per day, let alone per CPU.

Energise main phasers, All weapons to full power.
Target that WU (hadcm3n_7k6p_1980_40_008437444_1)
Phaser one, fire.
Phaser two, fire.

Target destroyed.
Confirmed.
19) Message boards : Number crunching : VANISHING WU'S (Message 47987)
Posted 15 Jan 2014 by DadX
Post:
If I keep on aborting these jobs will my machines get blacklisted?
20) Message boards : Cafe CPDN : Happy Christmas, happy holidays and happy new year! (Message 47862)
Posted 24 Dec 2013 by DadX
Post:
Here in New York City we have a strange tradition. A local TV station will broadcast logs burning in a fireplace (a Yule log) from 9:00 AM to 1:00 PM on Christmas morning. Yup, burning fossil fuels to generate electricity to watch a continuous loop of a dead burning tree, by people with no "fireside" traditions because they've never been near a real fireplace or campfire in their lives.
On the up side it's cold here and if they're watching it with an old tube TV or a plasma screen it will add some heat to their home.


