CPDN hogging disk

Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 46060 - Posted: 27 Apr 2013, 2:50:28 UTC

In preferences I set the max disk space to 4GB several weeks ago, but the BOINC Manager shows CPDN is taking 13GB.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 46061 - Posted: 27 Apr 2013, 3:08:27 UTC - in response to Message 46060.  

Unfortunately, when models crash they don't clean up after themselves. That has to be done manually.

Open the BOINC manager to the Tasks tab.
Open your folders browser. (Windows Explorer?)
Look at the climateprediction.net folder.
Compare the model names with those that are in your Tasks.
Delete any folders that AREN'T in Tasks.
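The comparison described above can be sketched in Python. This is only an illustration, not a CPDN or BOINC tool: the function names are made up, and it assumes a model folder belongs to a task when the task name starts with the folder name. It only lists candidates and their sizes; nothing is deleted.

```python
import os


def folder_size(path):
    """Return the total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


def find_orphans(project_dir, active_tasks):
    """List (folder_name, size_bytes) for folders that match no active task.

    Assumption: a folder is in use if some task name starts with the
    folder name. Check the result by eye before deleting anything.
    """
    orphans = []
    for entry in sorted(os.listdir(project_dir)):
        full_path = os.path.join(project_dir, entry)
        if not os.path.isdir(full_path):
            continue  # plain files are left alone
        if any(task.startswith(entry) for task in active_tasks):
            continue  # folder matches a task shown in the Tasks tab
        orphans.append((entry, folder_size(full_path)))
    return orphans
```

Pass it the path of the climateprediction.net folder and the task names copied from the Tasks tab, then review the returned list by hand before removing any folder.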

Having said that, if you run four hadcm3n models at a time, the accumulating climate data can reach several gigabytes per model just before it is zipped and returned to the project, so a total of 4 GB really isn't enough.


Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 46064 - Posted: 27 Apr 2013, 13:36:37 UTC - in response to Message 46061.  

Thanks Les. But, where is the climateprediction.net folder? Sorry that I'm not more familiar with my folder structure. By the way, I have blocked new hadcm3n tasks as per advice in another thread, because they almost always crash on my (very beefy) computer after running for a very long time!
Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 46065 - Posted: 27 Apr 2013, 13:40:39 UTC - in response to Message 46064.  

Also, is there a way I can limit the space allowed to CPDN, to make room for tasks from other BOINC projects? Or, are we only able to set the disk allotment for all BOINC projects in aggregate?
Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 46066 - Posted: 27 Apr 2013, 14:05:02 UTC - in response to Message 46065.  

Never mind. I found the BOINC data folder.
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 46067 - Posted: 27 Apr 2013, 16:07:10 UTC

Hi Steve

You can decide yourself what proportion of the computer's BOINC CPU time is devoted to each project. In your CPDN account (there's a link to it in the blue menu to the left) go to the climateprediction.net preferences. Right at the top you'll see Resource share. I think the default for each project is 100. This isn't a percentage; it's a proportion of the whole. If CPDN's your only project, 100 will be all of it. If you have two projects each with a share of 100, each will get 50%. You can increase or decrease the 100 to suit what you want. To vary the share for other projects you'll have to go into your accounts there.
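The arithmetic behind resource share can be written out as a short Python sketch. The function name and the project labels are illustrative only, and this shows the long-term target proportions, not how the BOINC scheduler actually enforces them.

```python
def share_fractions(shares):
    """Convert resource-share numbers into fractions of the whole."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive share")
    return {name: value / total for name, value in shares.items()}


# One project with the default share gets everything.
only_cpdn = share_fractions({"CPDN": 100})

# Two projects with equal shares get half each.
two_equal = share_fractions({"CPDN": 100, "other": 100})

# Lowering CPDN to 50 against another project at 150 gives it a quarter.
cpdn_reduced = share_fractions({"CPDN": 50, "other": 150})
```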

In your BOINC Manager Projects tab you'll see the current resource allocations.

BOINC tries to respect our resource allocations over the medium-long term.
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 46068 - Posted: 27 Apr 2013, 16:19:30 UTC
Last modified: 27 Apr 2013, 16:21:28 UTC

Steve, because you said a lot of your CPDN models had crashed I've looked at the tasks for your computer. You've had crashes for models that have succeeded on other computers. Your models have mostly crashed with error codes 1073741819 and 193. That's a nice computer but something isn't quite right.

http://boincfaq.mundayweb.com/index.php?language=1&view=98
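For readers who want to look these codes up, a small Python sketch can convert an exit code into the hexadecimal form used in Windows documentation. It assumes the first code was originally reported as a negative number (-1073741819) and that the sign was lost on this page; under that assumption it corresponds to 0xC0000005, the Windows access-violation status. Code 193 is, as far as I know, the Windows "not a valid Win32 application" error. The function name is made up for illustration.

```python
def to_hex32(exit_code):
    """Show an exit code as an unsigned 32-bit hexadecimal value."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Assumption: the code was reported with a minus sign originally.
access_violation = to_hex32(-1073741819)

# Small positive codes are ordinary Windows system error numbers.
bad_exe = to_hex32(193)
```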

Is your computer overclocked? Probably not, but if it is you should test it for stability.

Do you exit from BOINC completely before you turn off the computer?

Is BOINC excluded from AV scans?

I think your BOINC version could be a problem. You have 7.0.25 which AFAIK is a testing alpha version, not a public release stable version. It isn't being tested any longer so there's no need to keep it. Get the latest stable release from the main BOINC page and see whether that helps:

http://boinc.ssl.berkeley.edu/
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 46080 - Posted: 27 Apr 2013, 21:34:33 UTC - in response to Message 46065.  

If you limit the space by cutting back on the amount of disk space, then the models will crash when they run out of room.

To limit the space used by cpdn, you'll have to limit the number of models run at any time. To do this requires "micro managing" them, because you're running multiple projects.
This requires turning No new tasks on and off, and altering one of your preference settings back and forth.

First, in your account page, unselect the Coupled Ocean model and, just below that, the option "If no work for selected applications is available, accept work from other applications?"
This will stop you from getting the long models with their large amounts of data.

Next:
In the Projects tab of the BOINC manager, keep the setting for cpdn set at No new tasks.
Either on the project web site, or in your manager's menu, have the number of processors set to the maximum.

When you want/need a cpdn model:
First check the server status to see if there IS any work available. There's no point in all of the following if there isn't.


1) Set the network connection to Network activity suspended.
2) In the prefs of your choice, set the number of processors to 1 or 2.
3) In the Projects tab, click on the cpdn project, and then click the Update button.
4) Check the messages tab (or the messages window in the newest BOINC version) and wait until it gets the message about the changed number of processors. It will then run a new benchmark.

5) Set the project to allow new work. (Projects tab)
6) Set the network to Network activity always available.

BOINC should now ask for work from cpdn, but only get the 1 or 2 models that you set earlier.

When you've got them, set Network off again, and also set No new work.
Then go to your prefs setting and change the number of processors back to the maximum, so that your other WUs will also run.

The first hundred times are the hardest. After that it's easy.
:)
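Part of this routine could probably be scripted with the boinccmd command-line tool that ships with BOINC. The Python sketch below only builds the command lists and runs nothing. The project URL is an assumption (check the exact URL in the Projects tab), the function names are made up, and the processor-count changes in steps 2 and 4 still have to be done by hand.

```python
# Assumption: this is the project URL as BOINC knows it on your machine.
PROJECT_URL = "http://climateprediction.net/"


def fetch_commands(url=PROJECT_URL):
    """Commands mirroring steps 1, 3, 5 and 6 above, in order."""
    return [
        ["boinccmd", "--set_network_mode", "never"],      # step 1
        # step 2 (lower the processor count) is done manually here
        ["boinccmd", "--project", url, "update"],         # step 3
        # step 4 (wait for the changed-processors message) is manual
        ["boinccmd", "--project", url, "allowmorework"],  # step 5
        ["boinccmd", "--set_network_mode", "always"],     # step 6
    ]


def lockdown_commands(url=PROJECT_URL):
    """Commands for after the models have arrived."""
    return [
        ["boinccmd", "--set_network_mode", "never"],
        ["boinccmd", "--project", url, "nomorework"],
    ]
```

To actually run a list, each entry could be passed to subprocess.run from a terminal where boinccmd is on the path, pausing between steps to do the manual parts.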


Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 46081 - Posted: 27 Apr 2013, 23:01:53 UTC - in response to Message 46068.  

Hi Mo, thanks for the suggestions.

I see that my other projects have finally given me tasks, after I deleted the old CPDN folders left over after crashes that were hogging my BOINC disk allocation.

On those crashes:

My computer is not overclocked.

I normally shut down my computer without bothering to close applications. Does this upset BOINC? If so, I will be careful to close BOINC in the future.

I have not excluded BOINC from AV scans. Should I do that?

I normally only update applications when I am absolutely forced to; I have been burned too many times by updates that ruined a perfectly good app version. But, I took your advice and updated BOINC - I didn't realize that I was running an alpha version! (Another rule I have is to never run alpha or beta versions of any software!).

Les, thanks for all the detailed instructions. I didn't want to make a career of running BOINC, but I see that if I want to have some fun running multiple projects then I need to get a little more familiar with how things work. One thing that seemed to confuse me was the difference between BOINC preference settings and Project (e.g., CPDN) settings - it seems like both are accessible from the CPDN page. Anyway, I think you have given me enough info to sort it out. Thanks again - I appreciate your dedication.

One suggestion: I notice that other projects send me notices when my disk allocation is insufficient (which happened because CPDN was hogging 13GB with orphan tasks!). Why can't CPDN suspend processing and send a notice when disk allocation is running low?
JIM

Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 46082 - Posted: 28 Apr 2013, 0:00:34 UTC - in response to Message 46081.  

Dear Steve:

The answer to both your questions is YES!!! Climate models, especially the hadcm3n's, are touchy. They can fail if BOINC is stopped without exiting the model first. This seems to happen because BOINC (or Windows in a reboot) shuts down too fast for the model to write to disk all of the data that it needs to restart.

The answer to your second question is also yes. Antivirus programs have a long history of crashing models. While an AV program is scanning a file, that file is temporarily locked. If the CPDN model tries to write to the file while it is locked, the model will crash.

In either case, unless you have a backup made before the crash, it is RIP model.

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 46083 - Posted: 28 Apr 2013, 2:43:16 UTC - in response to Message 46081.  

Your question/idea about Notices is only relevant to projects whose server code is up to date.
Not only is cpdn's a few versions too old, it is also highly customised.

And Notices only apply to people using a version 7 of BOINC. Lots of people still use V6, and a few even still have V5.

There will be messages in the Messages window on your version, accessible from the menu somewhere. (And in the messages Tab for people with V6 and V5.)


Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 46088 - Posted: 28 Apr 2013, 13:23:51 UTC - in response to Message 46083.  

Thanks Les, I added both BOINC folders (in Program Files and in ProgramData) to my AV exceptions list. One question: when I exit BOINC Manager, does it also close down my running tasks? It doesn't seem to, since when I relaunch the manager the tasks seem to be running without any delay to scrounge stuff from disk.
Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 46089 - Posted: 28 Apr 2013, 13:32:07 UTC - in response to Message 46088.  

Oh, I meant to ask one other question:

It would seem that the obvious solution to the problem of occasional model corruption (due to too hasty shutdowns, or whatever) would be for CPDN to make automatic periodic backups on the client computer, say, every day or so; then, when a model failed to load, the program could automatically retrieve from backup. I'm sure you guys must have thought of this. Why couldn't that work? It is really disappointing to run a big model for weeks, only to have it ultimately fail.
Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 46090 - Posted: 28 Apr 2013, 14:33:14 UTC - in response to Message 46089.  

Sorry, I see that I have been habitually closing BOINC by hitting the "x", rather than shutting down properly. I guess I got in the habit of doing that because the "x" is the only way of closing my browser, Chrome.
Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 46091 - Posted: 28 Apr 2013, 14:48:34 UTC - in response to Message 46089.  

Automatic backups would require some way of discriminating between host-machine causes and buggy parameters/models. Some tasks are just born to crash, and we're seeing this right now with some recent hadcm3n's.
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 46092 - Posted: 28 Apr 2013, 16:17:41 UTC

Yes, a lot of members have asked whether CPDN or BOINC could carry out its own backups of the contents of the BOINC Data folder but it just isn't possible.

Well done for adding the two BOINC folders to your AV scan exceptions. Another way of doing things is only to run AV scans after exiting first from BOINC. But using the exceptions list is better because you can then let the AV run scans whenever it wants.

To go back to your question about exiting from BOINC. If you have the BOINC Manager open and click on the X in the top right corner, this closes the BOINC Manager but it doesn't stop BOINC itself from running. The tasks, whichever project they're from, will still keep running. This is what we have to stop before closing down the computer. So to exit completely from BOINC:

* Inside BOINC Manager first stop the tasks running. In the Activity menu select Suspend.
* Right-click on the BOINC icon then select Exit. The icon will disappear and you can shut down the computer.
* Another way to exit completely from BOINC. In BOINC Manager File > Exit. The icon will disappear and you can shut down the computer.

* Depending on how you have BOINC installed, when you restart the computer the BOINC icon may not reappear on its own. But you can always find it in the Start menu in the list of all your programs. You'll then need to go back to the BOINC Manager Activity menu to make the tasks run again.

The reason for exiting completely from BOINC before shutting down the computer is that Windows closes everything down really fast and this might catch the climate models just at the very moment when they're trying to write data to the disk. As you can imagine, the models don't like it and sooner or later this can cause a model crash.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 46095 - Posted: 28 Apr 2013, 20:30:32 UTC - in response to Message 46088.  

"when I exit BOINC Manager, does it also close down my running tasks?"

That depends on whether or not you ticked the option to do that when you installed BOINC. Or since.

The tasks are run/controlled by the core client, which is a background process. The job of the Manager is to show the user what the core client is doing.
Some people run without the manager. e.g. When they're running lots of computers in a school, and they don't want people interfering with the work.

In V6 there's a chance to select/deselect the client shut down when you Exit from BOINC.
I'm not sure about V7.
Also, I think that it's possible to get at the option from the menu, but I'm not sure where it is.


Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 46096 - Posted: 28 Apr 2013, 20:34:25 UTC - in response to Message 46089.  

Backups

This sticky post has been there a long time.
Things might be a bit different now, with the new versions of both OS and BOINC.


Steve Wenner

Joined: 2 Mar 06
Posts: 27
Credit: 240,040
RAC: 0
Message 46100 - Posted: 29 Apr 2013, 0:02:03 UTC - in response to Message 46091.  

Belfry, I would think that it would not be difficult to distinguish between a model bug and a problem on the host. If the crash was caused by a bug, then when the model was started from a backup it would crash again at the same point; however, if the problem was a host glitch, then the program would sail merrily along past the previous fail point.
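The heuristic proposed here can be sketched in a few lines of Python. This is purely a hypothetical illustration of the reasoning, not anything CPDN or BOINC implements; the function name, the idea of a numeric "step", and the tolerance parameter are all assumptions.

```python
def classify_crash(first_crash_step, retry_crash_step, tolerance=0):
    """Guess the cause of a crash from one rerun started from a backup.

    retry_crash_step is None when the rerun got past the earlier
    failure point without crashing.
    """
    if retry_crash_step is None:
        return "host glitch"  # the rerun sailed past the old fail point
    if abs(retry_crash_step - first_crash_step) <= tolerance:
        return "model bug"    # crashed again at the same point
    return "unclear"          # crashed somewhere else; no conclusion
```

A single rerun is weak evidence either way, so the "unclear" outcome is kept separate rather than forced into one of the two causes.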

I see from Les's backup post that in 2006 Richard Rodway had an automatic backup utility. Has that panned out, or is there something more recent?

Belfry

Joined: 19 Apr 08
Posts: 179
Credit: 4,306,992
RAC: 0
Message 46104 - Posted: 29 Apr 2013, 1:58:41 UTC

As I understand it, hadcm3n's most often process through errors and then crash at 25, 50, 75 and 100% points. So stopping times alone won't be enough to tell interrupted tasks from buggy ones. In any case some of the functionality you ask for already happens at the project level: crashed tasks simply get reissued to different computers, with the general hope that hardware and OS diversity increases the chances for completion.

©2024 climateprediction.net