Message boards :
Number crunching :
Model crashes
Author | Message |
---|---|
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
The pass mark is 1/10. Cpdn news |
Joined: 31 Mar 08 Posts: 1 Credit: 378,939 RAC: 0 |
Hello, I am a German user and my English is not so good. I have a problem: today my model crashed. I got the following messages:

01.05.2008 10:18:14|climateprediction.net|Computation for task hadcm3istd_0h2m_1920_160_05939256_0 finished
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_1.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_2.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_3.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_4.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_5.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_6.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_7.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_8.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_9.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_10.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_11.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_12.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_13.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_14.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_15.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent
01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_16.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent

Can someone help me? Why did I receive this error? What can I do? At the moment a new model is running, so I want to fix this error before I get the same error a second time. Thanks, Florian |
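A long run of near-identical messages like the one above is easier to read once it is summarized. The following is a minimal sketch, assuming each message line has the `timestamp|project|text` shape seen in the post; the function names `parse_message` and `summarize` are made up for illustration and are not part of BOINC.

```python
from collections import Counter


def parse_message(line):
    """Split one 'timestamp|project|text' line into its three fields."""
    timestamp, project, text = line.split("|", 2)
    return timestamp, project, text


def summarize(lines):
    """Return the set of finished tasks and a per-task count of absent files."""
    finished = set()
    absent = Counter()
    for line in lines:
        _, _, text = parse_message(line)
        words = text.split()
        if text.startswith("Output file") and words[-1] == "absent":
            # "Output file <file> for task <task> absent": task precedes 'absent'
            absent[words[-2]] += 1
        elif text.startswith("Computation for task") and words[-1] == "finished":
            # "Computation for task <task> finished": task is the fourth word
            finished.add(words[3])
    return finished, absent


sample = [
    "01.05.2008 10:18:14|climateprediction.net|Computation for task hadcm3istd_0h2m_1920_160_05939256_0 finished",
    "01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_1.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent",
    "01.05.2008 10:18:14|climateprediction.net|Output file hadcm3istd_0h2m_1920_160_05939256_0_2.zip for task hadcm3istd_0h2m_1920_160_05939256_0 absent",
]

finished, absent = summarize(sample)
for task in sorted(finished):
    print(f"{task}: finished, {absent[task]} output files absent")
```

A task that is reported as finished while every expected output file is absent is the pattern the replies below interpret as a crash rather than a normal completion.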
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
Florian, the final error messages have not yet appeared on the web page for that model (hadcm3istd_0h2m_1920_160_05939256_0). When they do, we will be able to comment on the cause of the crash. It doesn't usually take this long ... Iain |
Joined: 25 Aug 04 Posts: 21 Credit: 288,382 RAC: 0 |
I also got a crash today, error code #22, like I saw in one of the posts above. Crashed WU:

<core_client_version>6.1.16</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Not a JPEG file: starts with 0x01 0xda
Not a JPEG file: starts with 0x01 0xda
Suspended CPDN Monitor - Quit request from BOINC...
Model crashed:
Model crashed:
Model crashed:
Model crashed:
Model crashed:
Model crashed:
Sorry, too many model crashes! :-(
</stderr_txt>
]]> |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
BOINC version 6.1.16 isn't a public release yet, and this project won't be updating the server software to match for a while yet. As for error code 22, apart from what is already in the READMEs, it hasn't been solved. It seems to be a code used for several failure modes. Backups: Here |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hello Florian, greetings from England. Your model crashed this morning, but the BOINC manager messages don't explain why. The first line of the messages says only that it finished (= crashed). The other lines mean that the server wanted all these files, but your model didn't run long enough to produce them. Has this computer's BOINC manager had network (internet) activity today? I think probably not. Please tell us. Have you got a backup of the contents of the BOINC folder made before the model crashed? If you have, you can restore it and continue the model. Cpdn news |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi EJG, this is what moderator Thyme Lawn says about code 22:

"I suspect the CPDN controller process is passing on what it gets from the worker. If the worker has been written with 'meaningful exit codes' they could easily overlap with Unix error numbers (which are used to generate the messages). I've written such programs myself, and I always start the exit codes from 1 and work upwards. Not a problem as long as the program interpreting the exit status knows what they mean, but as soon as you're dealing with something that doesn't know the rules (i.e. BOINC) everything goes to pot. I suspect that exit code 22 is catching a variety of file access problems."

So it's a file access problem of some sort, but we don't know why. Something probably happened on your computer to cause the crash. You can go to the project READMEs through the link in my signature. In the README collection called Crashes and Problems, look at link #6 by Mike Mars. He lists all the most probable causes of crashes.

The message line 'Not a JPEG file.....etc' is not related to the crash. It simply means you opened the model graphics!

Have you got a backup of the contents of the BOINC folder made before the crash happened? If you have, you can restore it and continue the model.

Edit: this is the HADAM model that crashed: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7166370

The model was processing fast, 14 sec/TS. I know this is a fast computer, but is it also overclocked? This computer has Vista. Where is BOINC installed? The default location for BOINC is C:\Program Files\BOINC, but Vista doesn't like that. With Vista the location should be changed to C:\BOINC. I think you should uninstall BOINC version 6 and reinstall version 5 (unless you're a BOINC alpha tester). Cpdn news |
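The overlap Thyme Lawn describes can be seen directly: the number 22 already has a fixed meaning in the C library's errno table, so a program that uses 22 for its own purposes will be mislabelled by anything that looks the number up there. The sketch below is only an illustration of that mechanism. The `hypothetical_codes` table and the `describe` helper are invented for this example, and the real meaning of the worker's code 22 is not known from this thread. The Windows text quoted in the crash report ("The device does not recognize the command") appears to come from yet another table, the Windows system error codes, which is why it differs from the Unix wording.

```python
import errno
import os

exit_code = 22  # the exit code reported in the crash above

# What the C library's errno table associates with this number.
print(errno.errorcode.get(exit_code))  # symbolic errno name
print(os.strerror(exit_code))          # generic human-readable text


def describe(code, app_codes):
    """Prefer the application's own meaning for an exit code, if it is known."""
    if code in app_codes:
        return app_codes[code]
    return f"unknown to the app; errno table says {os.strerror(code)!r}"


# Hypothetical application-specific table, for illustration only.
hypothetical_codes = {22: "worker-defined failure (hypothetical meaning)"}

print(describe(exit_code, hypothetical_codes))  # interpreter knows the rules
print(describe(exit_code, {}))                  # interpreter does not
```

The last two lines show the difference between a caller that knows the application's rules and one that falls back on a generic table, which is the situation BOINC is in here.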
Joined: 25 Aug 04 Posts: 21 Credit: 288,382 RAC: 0 |
Thanks for the answer and the links, mo.v. I never overclock PCs that are running BOINC, so that cannot be the problem here. Unfortunately I stopped making regular backups when I stopped running Climate Prediction for some time and later only downloaded the shorter-running models. Maybe I have to start backing up again. :-) This is the only computer that is running 6.x, partly because I do indeed sometimes do some alpha testing, and partly because 5.x is blocked at Vista startup because it asks for admin rights (although there seems to be a workaround for that). For now I don't expect 6.1.16 to cause any problems. BOINC is not installed in C:\Program Files, but on the D drive. Vista seems happy. :-) I will try another model and see what happens. |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
I think it's worth making backups even of the 'shorter' models. For example, when you exit from BOINC to run an AV scan, defrag and reboot, that's a good moment. In the README collection about backups, the easy manual method described by Les only takes a few minutes. In my experience, backups made by this method always restore successfully. There's also a selection of more sophisticated methods. Cpdn news |
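For anyone who prefers to script the copy, here is a minimal sketch of a folder backup. It is not the README method by Les, whose details are not given in this thread; it simply copies a whole folder into a new timestamped folder. The helper name `backup_folder` and all paths are assumptions for illustration. As the post says, exit BOINC completely first so that no file changes in the middle of the copy.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_folder(source, backup_root):
    """Copy `source` into a new timestamped folder under `backup_root`."""
    source = Path(source)
    if not source.is_dir():
        raise FileNotFoundError(f"no such folder: {source}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree creates `target` (and missing parents) and fails if it exists,
    # so an earlier backup is never overwritten.
    shutil.copytree(source, target)
    return target


# Demo on a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "BOINC"
    data.mkdir()
    (data / "example.txt").write_text("model state would live here")
    copy = backup_folder(data, Path(tmp) / "backups")
    print(sorted(p.name for p in copy.iterdir()))
```

On a real machine the call would look something like `backup_folder(r"C:\BOINC", r"D:\boinc-backups")`, with both paths replaced by wherever BOINC and the backups actually live. Restoring is the reverse copy, again with BOINC shut down.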
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
Florian, the final error messages have now appeared on the web page for your model (hadcm3istd_0h2m_1920_160_05939256_0). The exit status is "-1073741819 (0xc0000005)", which is sometimes related to graphics compatibility. Try item #7 in the "crashes and other problems" README thread (which is here). I notice that the model was started and stopped quite a few times during its short history: some crashes can be avoided by stopping BOINC manually before closing the PC down. Iain |
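The two numbers in that exit status are the same 32-bit value written two ways: once as a signed decimal and once as unsigned hexadecimal. A short sketch of the conversion follows; the helper names are made up for this example. Windows documents the value 0xC0000005 as an access violation status, which is a general category and does not by itself say what triggered it.

```python
def to_unsigned32(status):
    """Reinterpret a signed 32-bit exit status as an unsigned value."""
    return status & 0xFFFFFFFF


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed exit status."""
    return value - 0x100000000 if value >= 0x80000000 else value


status = -1073741819
print(hex(to_unsigned32(status)))  # 0xc0000005
print(to_signed32(0xC0000005))     # -1073741819
```

This is handy when searching for an error, because some tools report the decimal form and others the hexadecimal one.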