Frustrating that you can't shut down without error

old_user30961
Joined: 21 Nov 04
Posts: 6
Credit: 902,269
RAC: 0
Message 41556 - Posted: 29 Jan 2011, 18:00:42 UTC

It is extremely frustrating that you cannot shut down BOINC without causing a climate model to error. It does not always happen, but it happens too frequently.

I have tried suspending activity, waiting for a period of time and then exiting the BOINC process. I saw a thread about making a change in the registry and I also did that. It did not help.

I used to shut down BOINC in order to allow disk defragmentation to take place, but I find that simply suspending activity for an hour or so without exiting BOINC works about as well. Unfortunately, Windows and other program updates sometimes make a shutdown necessary. If it weren't for this problem, I would have almost no errored models.

Chuck
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1079
Credit: 6,904,049
RAC: 6,657
Message 41557 - Posted: 29 Jan 2011, 19:16:25 UTC - in response to Message 41556.  

Quote: "Unfortunately, Windows and other program updates sometimes make a shutdown necessary. If it weren't for this problem, I would have almost no errored models."
Turning off automatic Windows updates does help reduce errors. Shutting BOINC down manually as well should reduce the error rate to a very low level.

(FAMOUS models have a relatively high repeatable error rate that is nothing to do with the computer itself - anyone with the same setup would get the error.)
DJStarfox

Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 41567 - Posted: 31 Jan 2011, 0:16:10 UTC
Last modified: 31 Jan 2011, 0:35:10 UTC

There are a couple of things you can do. In the BOINC Manager, in advanced view, go to the Advanced menu and select "Shutdown connected client..." before you back up, defragment, or shut down your machine. I always shut down BOINC during a backup so that BOINC restarts in a consistent state should I ever need to restore.
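
The same shutdown request can also be scripted, for example ahead of a scheduled backup. The following is a minimal Python sketch, assuming the `boinccmd` command-line tool that ships with BOINC is on the PATH and is allowed to talk to the local client (it may need to be run from the BOINC data directory so that it can read the GUI RPC password). The function names are illustrative and not part of BOINC.

```python
import shutil
import subprocess


def build_quit_command(boinccmd_path="boinccmd"):
    """Return the argument list that asks the local BOINC client to quit."""
    return [boinccmd_path, "--quit"]


def shut_down_boinc_client(boinccmd_path="boinccmd", timeout_s=60):
    """Ask the running client to quit; return True if boinccmd reported success."""
    if shutil.which(boinccmd_path) is None:
        # boinccmd was not found; fall back to the Manager menu item instead.
        return False
    result = subprocess.run(build_quit_command(boinccmd_path), timeout=timeout_s)
    return result.returncode == 0


if __name__ == "__main__":
    print("client asked to quit:", shut_down_boinc_client())
```

A successful return only means the request was delivered; it is still worth waiting until the science applications have actually exited before starting the backup or shutting down.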

Or, a less preferred option:
In preferences, set "leave applications in memory" to off. When you then suspend work (after at least one checkpoint), the application unloads from memory cleanly.

Also, if you installed BOINC as a service, you may want to check the registry value WaitToKillServiceTimeout under the key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control
and set it to at least 30000 (the value is in milliseconds). It must be of type REG_SZ.
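
To check the current setting without opening the registry editor, something like the following Python sketch could be used. It only reads the value and changes nothing. The helper names are illustrative, and the 30000 threshold is simply the figure suggested above.

```python
def timeout_is_sufficient(raw_value, minimum_ms=30000):
    """Return True if a WaitToKillServiceTimeout string is at least minimum_ms."""
    try:
        return int(str(raw_value).strip()) >= minimum_ms
    except ValueError:
        # The value is not a plain integer string.
        return False


def read_wait_to_kill_timeout():
    """Read the value from the registry (Windows only).

    Returns (value, is_reg_sz).
    """
    import winreg  # imported here so the helper above works on any platform

    key_path = r"SYSTEM\CurrentControlSet\Control"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value, value_type = winreg.QueryValueEx(key, "WaitToKillServiceTimeout")
    return value, value_type == winreg.REG_SZ
```

On a Windows machine, `timeout_is_sufficient(read_wait_to_kill_timeout()[0])` reports whether the service gets at least 30 seconds to stop before Windows kills it.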
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 41577 - Posted: 31 Jan 2011, 23:35:32 UTC

I believe it's only possible to defragment the BOINC directories (as well as everything else) if one exits from BOINC first. If BOINC is left running, defragmenting does no harm, but the BOINC directories don't get defragmented.

I know that some people run background backups with Boinc running; apparently this method produces restorable backups most of the time but not invariably. I take manual backups, copying and pasting. In Windows I've found that if I forget to exit from Boinc first, the pasting fails and I have to exit from Boinc and start again. The advantage is that in my experience a manual backup can always be restored.

Exiting from Boinc before shutting down the computer should only take a few moments. The problem is that there's nothing in Boinc itself to indicate that this is necessary, at least for CPDN, and lots of members probably don't realise.


old_user30961
Joined: 21 Nov 04
Posts: 6
Credit: 902,269
RAC: 0
Message 41580 - Posted: 1 Feb 2011, 16:07:02 UTC

I did everything suggested, including "Shutdown connected client". I have always had "leave applications in memory" off. It doesn't make any difference. A model still errored as soon as I restarted BOINC.

I see messages like this in STDERR:

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=3456, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=0, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2584, selfPID=4632, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
11:01:12 (4632): called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>hadam3p_eu_xexy_1981_1_006966974_0_5.zip</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>hadam3p_eu_xexy_1981_1_006966974_0_6.zip</file_name>
<error_code>-161</error_code>
</file_xfer_error>

Followed by the list of all missing ZIP files.
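
For a long STDERR listing like this, the affected files can be pulled out with a short script rather than read by eye. This is a minimal Python sketch, assuming the `<file_xfer_error>` entries always have the shape shown above. The function name is illustrative, and the sample text is copied from the excerpt in this post.

```python
import re

# Matches one <file_xfer_error> entry as it appears in the excerpt above.
FILE_XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)


def failed_uploads(stderr_text):
    """Return (file_name, error_code) pairs found in a task's stderr text."""
    return [
        (m.group("name"), int(m.group("code")))
        for m in FILE_XFER_ERROR.finditer(stderr_text)
    ]


sample = """<file_xfer_error>
<file_name>hadam3p_eu_xexy_1981_1_006966974_0_5.zip</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>hadam3p_eu_xexy_1981_1_006966974_0_6.zip</file_name>
<error_code>-161</error_code>
</file_xfer_error>"""

for name, code in failed_uploads(sample):
    print(name, code)
```

Every entry here carries the same code, -161, which fits the description of the ZIP files as missing when the client tried to upload them.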
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1079
Credit: 6,904,049
RAC: 6,657
Message 41582 - Posted: 1 Feb 2011, 17:44:57 UTC

For reference: hadam3p_eu_xexy_1981_1_006966974_0.

The next thing to check is that the BOINC application and data folders are excluded from any virus checking. A virus checker is just the kind of thing that could intervene when BOINC starts.
old_user30961
Joined: 21 Nov 04
Posts: 6
Credit: 902,269
RAC: 0
Message 41587 - Posted: 1 Feb 2011, 21:23:51 UTC

C:\Program Files\BOINC and C:\ProgramData\BOINC have always been excluded from virus scanning (Kaspersky). SETI never has this problem. I have to believe that there is some flaw in the climate models that prevents them from handling a shutdown properly.

Chuck
transient

Joined: 3 Oct 06
Posts: 43
Credit: 8,017,057
RAC: 0
Message 41589 - Posted: 2 Feb 2011, 6:44:56 UTC

I can shut down without models erroring out. I have no idea why you can't. I do know that I don't do anything special before shutting down. I know this remark doesn't help you, but it is my comment on the subject title of this thread.
