Posts by Jacob Klein

1) Message boards : Number crunching : not uploading ressuts: Disk quota exceeded (Message 57940)
Posted 15 Mar 2018 by Jacob Klein
Post:
Because of this issue, several PCs on my local network are continually uploading and then failing, which causes a nasty local bandwidth problem. I wonder whether the backoffs are aggressive enough. I hope the project gets this figured out promptly. Frustrated. Thanks for listening.
2) Questions and Answers : Windows : Visual Fortran Runtime error (Message 53079)
Posted 12 Dec 2015 by Jacob Klein
Post:
BOINC itself must ALWAYS be suspended and exited from BEFORE allowing ANY OS update.

But there hasn't been any "short" models for months, so there's no risk of crashing one.


Why do you say that? What are the consequences of not suspending and exiting BOINC? It should just resume tasks successfully from their last checkpoints, and that seems to work fine for almost all of my projects, I think.

If there's a BOINC problem, I'd like to know about it, so please explain a bit further.

Thanks.
3) Message boards : Number crunching : HadCM3s post-completion artifacts (Message 52506)
Posted 7 Sep 2015 by Jacob Klein
Post:
I have a 4-year-old installation where this project is using 17.4 GB, but Reset wasn't working, because BOINC did not know about the files. A recent check-in, however, will allow Reset to actually wipe out all of the files in the project directory, assuming you are not using Anonymous Platform. It did not make it into BOINC 7.6.9, but perhaps in the next version you'll be able to simply click Reset on this project, and it will do a better job of cleaning up the files.

Of course, it'd be better if the project itself kept itself clean . . . I still don't understand why it doesn't. Oh well.

Look for Reset to work better, in the next version of BOINC, as an easier option of cleaning up this project.
4) Message boards : Number crunching : Must set rsc_memory_bound correctly (Message 48679)
Posted 2 Apr 2014 by Jacob Klein
Post:
That is not a bad idea. I have passed along the info to the dev team


It turns out David liked the idea, and he has already implemented it, so BOINC will probably start sending that data with the next release (7.3.16+).

It looks like it'll be saved in the state file as:
<peak_working_set_size>
<peak_swap_size>
<peak_disk_usage>

... and will be sent to the server as:
<final_peak_working_set_size>
<final_peak_swap_size>
<final_peak_disk_usage>
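
For anyone who wants to look at these values once they show up, here is a minimal sketch of reading them in Python. The three tag names come from the list above; the enclosing <result> element, the task name, and the numbers are made up for illustration, and the assumption that the values are byte counts is mine, so check your own state file before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the three peak_* tag names come from the post.
# The <result> wrapper, the task name, and the values are invented.
SAMPLE = """
<result>
    <name>example_task_0</name>
    <peak_working_set_size>175710208</peak_working_set_size>
    <peak_swap_size>201326592</peak_swap_size>
    <peak_disk_usage>52428800</peak_disk_usage>
</result>
"""

PEAK_TAGS = ("peak_working_set_size", "peak_swap_size", "peak_disk_usage")


def read_peaks(xml_text):
    """Return {tag: value as float} for each peak tag found in the fragment."""
    root = ET.fromstring(xml_text)
    peaks = {}
    for tag in PEAK_TAGS:
        node = root.find(tag)
        if node is not None and node.text:
            peaks[tag] = float(node.text)
    return peaks


def to_mb(n_bytes):
    """Convert a byte count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


peaks = read_peaks(SAMPLE)
for tag, value in peaks.items():
    print(f"{tag}: {to_mb(value):.2f} MB")
```

Tags that are missing from the fragment are simply skipped, so the same function should tolerate state files written by older clients.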

Again, great idea!

http://boinc.berkeley.edu/gitweb/?p=boinc-v2.git;a=commit;h=b1a6fa39fc365b050141f5a89bf0d71a2a70303e

Client: keep track of job's peak WSS, swap size, and disk usage; send to server

Also fixed a bug where, if a job was aborted while not running,
its final CPU and elapsed time weren't copied from ACTIVE_TASK to RESULT,
hence not sent to scheduler
5) Message boards : Number crunching : Must set rsc_memory_bound correctly (Message 48669)
Posted 1 Apr 2014 by Jacob Klein
Post:
Correct. See post 2.
6) Message boards : Number crunching : Must set rsc_memory_bound correctly (Message 48665)
Posted 1 Apr 2014 by Jacob Klein
Post:
That is not a bad idea. I have passed along the info to the dev team, via the email below.


From: j...@msn.com
To: b..._alpha@ssl.berkeley.edu
Subject: Request - Report "maximum usage" variables when reporting tasks
Date: Tue, 1 Apr 2014 08:03:49 -0400

Below you will see an idea that came from the memory discussion on Climate Prediction's forum.

If I understand correctly, it would be nice to have the client keep track of the "maximum working set used" and also maybe the "maximum virtual memory used", and then report those values back to the server when reporting results. While I don't agree about them being displayed in stderr.txt, I do think it's a valid idea, and one that could give feedback to the projects. It would be especially useful to users to display those variables on the task's result page. Oh, and my idea is to do it for "maximum disk usage" too :-p Thoughts on adding these 3 variables?




http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=7802#48664
------------------------------------------
Just an idea for one of the next core client betas : if the core client would insert a hint about the maximum memory usage it found for a workunit, it would help the project developers adjust their limits, i.e. something like :

<core_client_version>7.3.20</core_client_version>
<max_mem_usage_found>168570139</max_mem_usage_found>
<![CDATA[

...

I might be wrong but a tag outside of the CDATA value should not confuse the server side.
------------------------------------------
7) Message boards : Number crunching : Must set rsc_memory_bound correctly (Message 48663)
Posted 1 Apr 2014 by Jacob Klein
Post:
Thanks for understanding. I was a bit miffed to see most of my tasks get aborted, too, but as you said, it comes with the territory of being a tester. I'm glad you agree that it'd be wise to correct the work unit parameters.
8) Message boards : Number crunching : Must set rsc_memory_bound correctly (Message 48654)
Posted 1 Apr 2014 by Jacob Klein
Post:
As an Alpha tester, it is my responsibility to report problems as soon as I see them. In this case, I saw a problem: over half of my tasks were instantly aborted across various projects. It was caused by incorrect rsc_memory_bound settings, and I reported it to various projects, including yours, so that you would have as much time as possible to take the necessary action. At the time I reported the problem, we were going to keep the change, but as the 2nd post indicates, the change will be reverted.

I wasn't asking you to cater for me or for BOINC Alpha; I was trying to prevent a problem for your project's general user base, as we ramp up towards our public BOINC release.

I'd like to think you'd be less pessimistic about this. Perhaps I read your response wrong. It's been a long day.

Regards,
Jacob
9) Message boards : Number crunching : Must set rsc_memory_bound correctly (Message 48652)
Posted 1 Apr 2014 by Jacob Klein
Post:
It looks like this change is being reverted for now, per David's email below.
So, there is no longer an immediate need to correct the value...
But please consider setting it correctly at some point, in case it gets used by the client in the future.


> Date: Mon, 31 Mar 2014 18:53:33 -0700
> From: d...a@ssl.berkeley.edu
> To: b...c_alpha@ssl.berkeley.edu
> Subject: Re: [boinc_alpha] 7.3.14 - Heads up - Memory bound enforcement
>
> On further thought, I'm going to change things back to the way they were, namely
>
> 1) workunit.rsc_memory_bound is used only by the server;
> it won't send a job if rsc_memory_bound > host's available RAM
> 2) the client aborts a job if working set size > host's available RAM
> 3) the client will run a set of jobs only if the sum of their WSSs
> fits in available RAM
> (i.e. if a job's WSS is close to all available RAM,
> it would run that job and nothing else)
>
> The reason for not aborting jobs when WSS > rsc_memory_bound is that
> it requires projects to come up with very accurate estimates of RAM usage,
> which I don't think is feasible in general.
> Also, it will lead to lots of aborted jobs, which is bad for volunteer morale.
>
> -- David
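
The three rules in David's email can be sketched roughly as follows. This is only an illustration of the rules as written in the email, not the actual BOINC client or scheduler code. The Job class, the function names, and the greedy "take jobs in order while they fit" choice for rule 3 are my assumptions; the email does not say how the set of jobs is picked.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass
class Job:
    name: str
    rsc_memory_bound: int  # project's RAM estimate for the job, in bytes
    wss: int               # measured working set size, in bytes


def server_should_send(job, available_ram):
    """Rule 1: the server won't send a job whose bound exceeds available RAM."""
    return job.rsc_memory_bound <= available_ram


def client_should_abort(job, available_ram):
    """Rule 2: the client aborts a job whose working set exceeds available RAM."""
    return job.wss > available_ram


def pick_runnable(jobs, available_ram):
    """Rule 3: run a set of jobs only if the sum of their WSSs fits in RAM.

    Assumption: jobs are considered in the given order and skipped when they
    would not fit. The email does not specify the selection order.
    """
    chosen, used = [], 0
    for job in jobs:
        if used + job.wss <= available_ram:
            chosen.append(job)
            used += job.wss
    return chosen


jobs = [
    Job("a", rsc_memory_bound=3000 * MB, wss=3500 * MB),
    Job("b", rsc_memory_bound=1000 * MB, wss=1000 * MB),
    Job("c", rsc_memory_bound=500 * MB, wss=400 * MB),
]
running = pick_runnable(jobs, available_ram=4000 * MB)
print([job.name for job in running])
```

Note that job "a" uses more than its rsc_memory_bound here, yet under the reverted behaviour it is not aborted, because its working set still fits in available RAM.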
10) Message boards : Number crunching : Must set rsc_memory_bound correctly (Message 48651)
Posted 1 Apr 2014 by Jacob Klein
Post:
ClimatePrediction Team:

You need to change your work unit parameters so that <rsc_memory_bound> is set correctly. BOINC 7.3.14 alpha (and potentially future versions) reads that value, compares it to the working set size, and auto-aborts the work unit if the working set size exceeds the bound.

As of right now, I am getting errors due to your incorrect settings.

For example:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=16297167
Exit status 198 (0xc6) (EXIT_MEM_LIMIT_EXCEEDED)
<core_client_version>7.3.14</core_client_version>
<![CDATA[
<message>
working set size > workunit.rsc_memory_bound: 167.57MB > 118.26MB
</message>
<stderr_txt>
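
To see how far off the bound is, the message line in the output above can be pulled apart with a small script. This is a sketch that assumes the message always has exactly the wording and "MB" units shown in this one example; the function name is mine and it is not part of BOINC.

```python
import re

# Message text copied from the result page quoted above.
MESSAGE = "working set size > workunit.rsc_memory_bound: 167.57MB > 118.26MB"

# Assumption: the format is always "<wss>MB > <bound>MB" as in this example.
PATTERN = re.compile(
    r"working set size > workunit\.rsc_memory_bound:\s*([\d.]+)MB\s*>\s*([\d.]+)MB"
)


def parse_mem_abort(message):
    """Return (wss_mb, bound_mb, overshoot_mb), or None if the text doesn't match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    wss = float(match.group(1))
    bound = float(match.group(2))
    return wss, bound, wss - bound


parsed = parse_mem_abort(MESSAGE)
if parsed is not None:
    wss, bound, over = parsed
    print(f"working set {wss} MB, bound {bound} MB, over by {over:.2f} MB")
```

Collecting the overshoot across many failed results would give a rough idea of how much headroom the bound needs.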

Could you please promptly fix this?

Regards,
Jacob Klein
11) Questions and Answers : Windows : Finished former projects removing (Message 47602)
Posted 18 Nov 2013 by Jacob Klein
Post:
If the project was added using an Account Manager, then the "Remove" button in BOINC Manager will be disabled (greyed out).

To remove the project, you can either:

a) Go to the Account Manager's website, remove it there, and in BOINC Manager, click Tools -> Synchronize with... to force a communication with the Account Manager to get the updated list of projects
or
b) In BOINC Manager, click Tools -> Stop using... to stop using the Account Manager, then you can Remove any of the Projects

Good luck!
Jacob
12) Message boards : Number crunching : A different upload problem -- out of disk space on rapid-watch.badc.rl.ac.uk (Message 47535)
Posted 11 Nov 2013 by Jacob Klein
Post:
It'll be resolved before 2/4/2014, but hopefully within a couple days/weeks.
13) Message boards : Number crunching : A different upload problem -- out of disk space on rapid-watch.badc.rl.ac.uk (Message 47515)
Posted 9 Nov 2013 by Jacob Klein
Post:
Recommending to turn off network activity is silly! Don't do that! BOINC intelligently backs off the upload-retry-intervals automatically.
I'm attached to 26 other projects, doing work for them just fine, and I (obviously) require network activity to both download new tasks and upload completed results.
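
To illustrate what "backing off" means in general, here is a generic exponential backoff sketch. It is not BOINC's actual retry algorithm, which I have not checked; the base delay, the cap, and the jitter option are all made-up values chosen only to show the shape of the idea.

```python
import random


def backoff_delays(attempts, base=60.0, cap=4 * 3600.0, jitter=0.0, rng=None):
    """Return the wait (in seconds) before each retry.

    Each delay doubles from `base` and is limited to `cap`. With jitter > 0,
    each delay is scaled by a random factor in [1 - jitter, 1 + jitter].
    All defaults are hypothetical, not BOINC's real settings.
    """
    rng = rng or random.Random()
    delays = []
    for n in range(attempts):
        delay = min(cap, base * (2 ** n))
        if jitter:
            delay *= 1 + rng.uniform(-jitter, jitter)
        delays.append(delay)
    return delays


for attempt, delay in enumerate(backoff_delays(10), start=1):
    print(f"retry {attempt}: wait {delay / 60:.0f} min")
```

Because the wait grows with every failure, a stuck upload ends up retrying only occasionally, so the rest of the network activity is barely affected.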
14) Message boards : Number crunching : A different upload problem -- out of disk space on rapid-watch.badc.rl.ac.uk (Message 47513)
Posted 9 Nov 2013 by Jacob Klein
Post:
Thank you, and thanks for making sure someone is trying to fix it.

I will continue to monitor the upload every few days, and hopefully it gets resolved, especially before 2/4/2014.
15) Message boards : Number crunching : A different upload problem -- out of disk space on rapid-watch.badc.rl.ac.uk (Message 47509)
Posted 9 Nov 2013 by Jacob Klein
Post:
Okay, patience is needed.

So, the problem started around 11/6/2013.

The main question I have is:
How long will BOINC allow an upload to fail before it does something crazy like abort it? My wife has her first completed task queued up, and it took 640 hours (~27 days) to complete. I don't want to see all that work wasted.
16) Message boards : Number crunching : Trickle-up message (Message 46850)
Posted 23 Aug 2013 by Jacob Klein
Post:
The Data directory (where client_state.xml lives) is listed in one of the first 10 messages at BOINC startup, in the Event Log viewer.

Advanced View -> Advanced -> Event Log... -> Scroll to the top

Is it possible that the error is just a connectivity error? BOINC on my machine has had trouble connecting to the climateprediction project the last couple days, and I couldn't even connect to climateprediction forums or project status pages.



