Model crashes

Profile mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 33514 - Posted: 21 Apr 2008, 20:41:20 UTC

The pass mark is 1/10.
Cpdn news
ID: 33514
Florian

Joined: 31 Mar 08
Posts: 1
Credit: 378,939
RAC: 0
Message 33626 - Posted: 1 May 2008, 9:19:06 UTC

Hello,
I am a German user and my English is not so good.
I have a problem: today my model crashed.
I got the following messages:
01.05.2008 10:18:14|climateprediction.net|Computation for task hadcm3istd_0h2m_1920_160_05939256_0 finished
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_1.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_2.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_3.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_4.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_5.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_6.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_7.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_8.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_9.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_10.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_11.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_12.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_13.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_14.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_15.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_16.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent

Can someone help me? Why did I receive this error? What can I do?
At the moment, a new model is running, so I want to fix this error before I receive the same error a second time.

Thanks
Florian
ID: 33626
Profile Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 33629 - Posted: 1 May 2008, 13:00:13 UTC

Florian,

The final error messages have not yet appeared on the Web page for that model (hadcm3istd_0h2m_1920_160_05939256_0). When that happens we will be able to comment on the cause of the crash - it doesn't usually take this long ...

Iain
ID: 33629
old_user1041

Joined: 25 Aug 04
Posts: 21
Credit: 288,382
RAC: 0
Message 33634 - Posted: 1 May 2008, 14:17:43 UTC

Also got a crash today, error code #22 like I saw in one of the posts above.

crashed WU

<core_client_version>6.1.16</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Not a JPEG file: starts with 0x01 0xda
Not a JPEG file: starts with 0x01 0xda
Suspended CPDN Monitor - Quit request from BOINC...

Model crashed: 

Model crashed: 

Model crashed: 

Model crashed: 

Model crashed: 

Model crashed: 
Sorry, too many model crashes! :-(

</stderr_txt>
]]>

ID: 33634
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 33636 - Posted: 1 May 2008, 15:08:02 UTC
Last modified: 1 May 2008, 15:09:48 UTC

BOINC version 6.1.16 isn't a public release yet, and this project won't be updating the server software to match for a while yet.

As for error code 22, apart from what is already in the READMEs, it hasn't been solved. It seems to be a code used for several failure modes.


Backups: Here
ID: 33636
Profile mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 33639 - Posted: 1 May 2008, 16:41:44 UTC
Last modified: 1 May 2008, 17:26:58 UTC

Hello Florian, greetings from England.

Your model crashed this morning, but the BOINC manager messages don't explain why. The first line of the messages says only that it finished (= crashed). The other message lines mean that the server wanted all these files, but your model didn't run long enough to produce them.
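
For anyone who wants to check a long message log quickly, here is a minimal Python sketch that counts the "finished" and "absent" lines per task. It assumes the pipe-delimited `timestamp|project|text` layout shown in Florian's post; the function name `summarise_messages` is only illustrative and is not part of BOINC.

```python
from collections import Counter

def summarise_messages(lines):
    """Return (set of finished tasks, Counter of absent output files per task)."""
    finished = set()
    absent = Counter()
    for line in lines:
        parts = line.strip().split("|", 2)
        if len(parts) != 3:
            continue  # not a timestamp|project|text line
        text = parts[2]
        words = text.split()
        if text.startswith("Computation for task ") and text.endswith(" finished"):
            # "Computation for task <task> finished"
            finished.add(words[3])
        elif text.startswith("Output file ") and text.endswith(" absent"):
            # "Output file <file> for task <task> absent"
            absent[words[-2]] += 1
    return finished, absent
```

A task that appears in `finished` and also has a full set of absent output files, as in Florian's log, stopped before it produced any uploads.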

Has this computer's BOINC manager had network activity (internet) today? I think probably not. Please tell us.

Have you got a backup of the contents of the BOINC folder made before the model crashed? If you have a backup you can restore it and continue the model.


Cpdn news
ID: 33639
Profile mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 33641 - Posted: 1 May 2008, 17:09:57 UTC
Last modified: 1 May 2008, 17:32:36 UTC

Hi EJG

This is what moderator Thyme Lawn says about code 22.

I suspect the CPDN controller process is passing on what it gets from the worker. If the worker has been written with 'meaningful exit codes' they could easily overlap with Unix error numbers (which are used to generate the messages). I've written such programs myself, and I always start the exit codes from 1 and work upwards. Not a problem as long as the program interpreting the exit status knows what they mean, but as soon as you're dealing with something that doesn't know the rules (i.e. BOINC) everything goes to pot. I suspect that exit code 22 is catching a variety of file access problems.
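
The overlap described in that quote can be illustrated with a small Python sketch. The `WORKER_EXIT_CODES` table is entirely hypothetical (it is not CPDN's real list of exit codes); the point is only that the same number gets a different meaning depending on who interprets it. On Unix, error number 22 is `EINVAL`, while the text in the crash report above ("The device does not recognize the command") is the Windows message for system error 22.

```python
import errno
import os

# Hypothetical exit codes for an imaginary worker program; NOT CPDN's actual table.
WORKER_EXIT_CODES = {
    1: "bad command line",
    22: "could not open restart file",
}

def describe_as_os_error(code):
    """What a generic caller reports if it treats the exit code as an OS error number."""
    name = errno.errorcode.get(code, "unknown")
    return f"{name}: {os.strerror(code)}"

def describe_as_worker_code(code):
    """What the worker's author actually meant by the exit code."""
    return WORKER_EXIT_CODES.get(code, "unknown worker exit code")

if __name__ == "__main__":
    print("caller's reading:", describe_as_os_error(22))
    print("worker's reading:", describe_as_worker_code(22))
```

Both readings are "correct" for the number 22, which is why the message BOINC prints says little about what actually went wrong inside the model.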


So it's a file access problem of some sort but we don't know why. Something probably happened on your computer to cause the crash.

You can go to the project READMEs through the link in my signature. In the README collection called Crashes and Problems, look at link #6 by Mike Mars. He lists all the most probable causes of crashes.

The message line 'Not a JPEG file.....etc' is not related to the crash. It simply means you opened the model graphics!

Have you got a backup of the contents of the BOINC folder made before the crash happened? If you have a backup you can restore it and continue the model.

Edit: this is the HADAM model that crashed:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7166370
The model was processing fast, 14 sec/TS. I know this is a fast computer, but is it also overclocked?

This computer has Vista. Where is BOINC installed? The default location for BOINC is C:\Program Files\BOINC, but Vista doesn't like that. With Vista the location should be changed to C:\BOINC.

I think you should uninstall BOINC version 6 and reinstall version 5 (unless you're a BOINC alpha tester).

Cpdn news
ID: 33641
old_user1041

Joined: 25 Aug 04
Posts: 21
Credit: 288,382
RAC: 0
Message 33647 - Posted: 1 May 2008, 18:47:29 UTC - in response to Message 33641.  

Thanks for the answer and the links mo.v.

I never overclock PCs that are running BOINC, so that cannot be the problem here. Unfortunately I stopped making regular backups when I stopped running Climate Prediction for some time and later only downloaded the shorter-running models. Maybe I have to start backing up again. :-)

This is the only computer that is running 6.x, partly because I do indeed sometimes do some alpha testing, partly because 5.x is blocked at Vista startup (although there seems to be a workaround for that) because it asks for admin rights. For now I don't expect 6.1.16 to cause any problems.

BOINC is not installed in c:\program files, but on the D drive. Vista seems happy. :-)

I will try another model and see what happens.
ID: 33647
Profile mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 33649 - Posted: 1 May 2008, 22:05:41 UTC

I think it's worth making backups even of the 'shorter' models. For example, when you exit from BOINC to run an AV scan, defrag and reboot, that's a good moment. In the README collection about backups, the easy manual method described by Les only takes a few minutes. In my experience, backups made by this method always restore successfully. There's also a selection of more sophisticated methods.
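
As a rough illustration of the idea (not necessarily the method in the READMEs), a backup can be as simple as copying the whole BOINC folder somewhere else while BOINC is not running. The Python sketch below assumes you have already exited BOINC; the function name and the timestamped folder naming are my own choices, and the paths in the usage comment are only examples.

```python
import shutil
import time
from pathlib import Path

def backup_boinc_folder(boinc_dir, backup_root):
    """Copy the whole BOINC folder into a new timestamped folder and return its path.

    Run this only after BOINC has been exited, so no model files change mid-copy.
    """
    src = Path(boinc_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"BOINC folder not found: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree raises an error if dest already exists, so an older backup is never overwritten
    shutil.copytree(src, dest)
    return dest

# Example call (paths are assumptions; adjust to your own installation):
# backup_boinc_folder(r"C:\BOINC", r"E:\boinc-backups")
```

To restore, exit BOINC again and copy the contents of the backup folder back over the BOINC folder.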


Cpdn news
ID: 33649
Profile Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 33691 - Posted: 6 May 2008, 17:09:07 UTC - in response to Message 33629.  

Florian,

The final error messages have now appeared on the Web page for your model (hadcm3istd_0h2m_1920_160_05939256_0).

The exit status is "-1073741819 (0xc0000005)", which is sometimes related to graphics compatibility. Try item #7 in the "crashes and other problems" README thread (which is here).
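
In case the two numbers look unrelated: they are the same 32-bit value, shown once as a signed decimal and once as unsigned hexadecimal. On Windows, 0xC0000005 is the status code for an access violation. A small Python sketch of the conversion (the helper names are just for illustration):

```python
def to_unsigned32(status):
    """Reinterpret a signed 32-bit exit status as an unsigned 32-bit value."""
    return status & 0xFFFFFFFF

def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit exit status."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

if __name__ == "__main__":
    print(hex(to_unsigned32(-1073741819)))
    print(to_signed32(0xC0000005))
```

Small positive codes such as 22 (0x16) are unchanged by the conversion; only values with the top bit set show up as large negative numbers.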

I notice that the model was started and stopped quite a few times during its short history: some crashes can be avoided by stopping BOINC manually before closing the PC down.

Iain
ID: 33691

©2024 climateprediction.net