Another exit code -5

Profile Honza
Volunteer moderator

Joined: 5 Aug 04
Posts: 390
Credit: 2,475,242
RAC: 0
Message 3967 - Posted: 12 Sep 2004, 14:55:53 UTC

Hi there,

Today I experienced another exit code -5.
I think this one is a bit different in terms of conditions. I opened a large file in Photoshop and ran some operations during which Photoshop took over 1 GB of RAM. I have 1.2 GB available to Windows; the remaining 800 MB is dedicated to a RAM disk (excluded memory).
This situation must have become stressful for both Photoshop and BOINC (2 concurrent models on a P4 HT). When I realized that Photoshop needed more memory than I expected, I immediately shut down Maxthon and BOINC. One model died (with a premature upload upon BOINC restart); the other went fine (and is still going).
I realized that I don't have a swap file (but plenty of memory, heh - still not good for M$ Win).

My report also raises a question: how does BOINC deal with low memory?
Can some exit code -5 errors be triggered by Windows running low on memory?
ID: 3967
old_user1
Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 3970 - Posted: 12 Sep 2004, 16:11:57 UTC
Last modified: 12 Sep 2004, 16:17:49 UTC

An exit -5 is from CPDN and not BOINC. Since CPDN should already have the 50MB or so of memory it needs, perhaps it was a file I/O problem, i.e. the big Photoshop operation required making a big chunk of swap space that perhaps conflicted with the model wanting to write files (e.g. the restart dump "checkpoint"), or the model wanted to read a file and "timed out." That's my only idea for what happened.

Wow, as a follow-up, I may have been right for once. Here's the error output from your "yabsd.out", fresh from the upload server:

OPEN: **WARNING: FILE NOT FOUND
OPEN: Ignored Request to Open File dataout/restart.day for Reading

So something must have messed up the checkpoint writing of restart.day. (Is there a restart.day.zip file in the directory for this workunit, # 003u_300025123?)
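
For anyone who wants to check their own output for the same symptom, here is a minimal Python sketch that pulls the names of files the model failed to open out of a log. It assumes the log always uses the two-line pattern quoted above; that pattern is inferred only from this one excerpt, so a real yabsd.out may look different. The function name `missing_files` is made up for illustration.

```python
import re

# Pattern assumed from the two log lines quoted in this thread only;
# other yabsd.out files may word the message differently.
_IGNORED_OPEN = re.compile(r"^OPEN: Ignored Request to Open File (\S+) for Reading")


def missing_files(log_text):
    """Return files that were reported as not found and then skipped for reading."""
    found = []
    warned = False  # True only if the previous line was the FILE NOT FOUND warning
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("OPEN: **WARNING: FILE NOT FOUND"):
            warned = True
            continue
        match = _IGNORED_OPEN.match(line)
        if warned and match:
            found.append(match.group(1))
        warned = False
    return found


sample = """OPEN: **WARNING: FILE NOT FOUND
OPEN: Ignored Request to Open File dataout/restart.day for Reading
"""
print(missing_files(sample))
```

If the list includes a restart file, that would point to a checkpoint that was never written or was removed before the model tried to read it back.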
ID: 3970
Profile Honza
Volunteer moderator

Joined: 5 Aug 04
Posts: 390
Credit: 2,475,242
RAC: 0
Message 3979 - Posted: 12 Sep 2004, 19:12:36 UTC - in response to Message 3970.  
Last modified: 12 Sep 2004, 19:14:00 UTC

Thanks for investigating, Carl.

Just a couple of ideas:
- Is it possible to simply restart from the day/month/year checkpoint in cases such as 'exit code -5', or perhaps some other errors? This may slow down processing somewhat by computing extra checks, but will likely increase the number of completed runs.
- Are all uploaded files also stored locally?
- Is there a way to look into such verbose reports?

These questions are somewhat interconnected, with the aim of investigating the possible BOINC/CPDN causes of crashes and ways of avoiding them.
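
To make the first idea concrete, here is a rough Python sketch of the fallback, not a description of how CPDN actually handles restarts. Only restart.day is mentioned in this thread; the names restart.month and restart.year, and the function `pick_restart`, are hypothetical and used purely for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical names: only restart.day appears in this thread. The month and
# year files are assumed here to illustrate the fallback order.
RESTART_CANDIDATES = ("restart.day", "restart.month", "restart.year")


def pick_restart(dataout_dir):
    """Return the first candidate restart file that exists and is non-empty, else None."""
    base = Path(dataout_dir)
    for name in RESTART_CANDIDATES:
        candidate = base / name
        if candidate.is_file() and candidate.stat().st_size > 0:
            return candidate
    return None


# Demo: the daily checkpoint is missing, so the monthly one is chosen.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "restart.month").write_bytes(b"checkpoint")
    print(pick_restart(tmp))
```

The most recent checkpoint is tried first so that as little work as possible is repeated; an empty file is treated as missing, since a write interrupted by a crash could leave one behind.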
ID: 3979
old_user1
Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 3980 - Posted: 12 Sep 2004, 19:43:31 UTC - in response to Message 3979.  

Actually, I think all the good stuff, i.e. the error message, is in yabsd.out, which gets uploaded but may get erased from the hard drive when I zip everything up (the zip files contain the .nc files and the final restart.day).
ID: 3980
