Sulphur Cycle keeps crashing

old_user21418

Joined: 27 Sep 04
Posts: 4
Credit: 38,854
RAC: 0
Message 20797 - Posted: 27 Feb 2006, 12:08:03 UTC

Hi,

I seem to be having trouble running the Sulphur Cycle experiment: I've had several models all crash within a few days. Is there anything I can do to improve their chances?

I'm running BOINC version 4.45 on Windows XP.

Cheers
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 20802 - Posted: 27 Feb 2006, 14:48:30 UTC

The following *may* help, although these are just guesses at this point:

* Update to BOINC 5.

* If you use Norton Antivirus or Sophos, add the BOINC directories to the exclusion list.

* Test your system with Prime95.

* If you see a Microsoft 'Send / Don't Send' dialogue, don't select anything until you have killed the various BOINC processes.

Is there anything interesting in the 'Messages' tab?

I'm a volunteer and my views are my own.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 20805 - Posted: 27 Feb 2006, 17:44:38 UTC

BOINC 4.45 has at least two bugs that cause problems with this project, so it's not recommended.
Follow MikeMarsUK's advice and upgrade.

Then follow his other advice.

old_user21418

Joined: 27 Sep 04
Posts: 4
Credit: 38,854
RAC: 0
Message 20973 - Posted: 2 Mar 2006, 13:58:37 UTC

Hi,

I've upgraded to BOINC 5.2.13, but I'm still not getting anywhere.

I have no control over the antivirus software, as it's a networked machine and that is managed centrally.

What is Prime95?

The latest message tab reads (and seems to be the same every time):

02/03/2006 01:49:43|climateprediction.net|Unrecoverable error for result sulphur_hpno_100826404_1 (<file_xfer_error> <file_name>sulphur_hpno_100826404_1_1.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_hpno_100826404_1_2.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_hpno_100826404_1_3.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_hpno_100826404_1_4.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error><file_xfer_error> <file_name>sulphur_hpno_100826404_1_5.zip</file_name> <error_code>-161</error_code> <error_message></error_message></file_xfer_error>)
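For anyone who wants to pull the failing file names and codes out of a message like the one above, here is a minimal Python sketch. The regular expression and the function name `extract_file_errors` are illustrative and not part of BOINC; the sketch only assumes the element layout visible in that log line, and the sample is shortened to the first two of the five files.

```python
import re

# Matches one <file_xfer_error> element, laid out as in the log line above.
FILE_XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]*)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"<error_message>(?P<message>[^<]*)</error_message>\s*"
    r"</file_xfer_error>"
)


def extract_file_errors(log_line):
    """Return (file_name, error_code, error_message) for each failed transfer."""
    return [
        (m.group("name"), int(m.group("code")), m.group("message"))
        for m in FILE_XFER_ERROR.finditer(log_line)
    ]


# Shortened copy of the message above (first two files only).
sample = (
    "02/03/2006 01:49:43|climateprediction.net|Unrecoverable error for result "
    "sulphur_hpno_100826404_1 (<file_xfer_error> "
    "<file_name>sulphur_hpno_100826404_1_1.zip</file_name> "
    "<error_code>-161</error_code> <error_message></error_message>"
    "</file_xfer_error><file_xfer_error> "
    "<file_name>sulphur_hpno_100826404_1_2.zip</file_name> "
    "<error_code>-161</error_code> <error_message></error_message>"
    "</file_xfer_error>)"
)

for name, code, message in extract_file_errors(sample):
    print(name, code, repr(message))
```

Every file in the message carries the same code, -161, with an empty error message, so the log line alone gives no further detail; the code would have to be looked up in the BOINC client's own error definitions.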


I'm also running CPDN classic at the same time. Could this be causing any problems?


Any further suggestions gratefully received. I never seemed to have these problems before switching to the sulphur model.

Andrew Hingston
Volunteer moderator
Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 20987 - Posted: 2 Mar 2006, 23:53:46 UTC

It is a pity that you have no control over the antivirus software, as it may be the culprit, though I agree it is odd that the problem did not show up before. There is a report of Sophos AV software crashing the program because it detected a sequence of code that it wrongly thought was a virus, so these things can hit you out of the blue.

On Prime95, there is an explanation here. You will see it is a popular way of stress testing a computer.
old_user81808

Joined: 12 Jun 05
Posts: 3
Credit: 231,370
RAC: 0
Message 21017 - Posted: 3 Mar 2006, 20:08:55 UTC - in response to Message 20973.  

[Accidental repost of Message 20973, quoted in full with no new text.]


old_user81808

Joined: 12 Jun 05
Posts: 3
Credit: 231,370
RAC: 0
Message 21023 - Posted: 3 Mar 2006, 20:17:55 UTC - in response to Message 21017.  

Please excuse the last post repeating a previous one. I hit the post button accidentally.

I am experiencing the same problem. I am running both the BBC experiment and sulphur, and both models have crashed, separately. I'm running BOINC 5.2.8 on an Athlon 64 3200 with 2 GB of memory. I do have control over my AV software (Norton), but I had been running the climate experiments without any problems for almost a year until I started running the BBC experiment. This is the latest crash message sequence. Notice that the sulphur model crashed AFTER it had been removed from memory to restart the BBC model:

03/03/06 1:00:58 PM|climateprediction.net|Pausing result sulphur_hbtw_100808484_1 (removed from memory)
03/03/06 1:00:58 PM|BBC Climate Change Experiment|Restarting result hadcm3l_1f6u_00066796_0 using hadcm3l version 507
03/03/06 2:00:58 PM|BBC Climate Change Experiment|Pausing result hadcm3l_1f6u_00066796_0 (removed from memory)
03/03/06 2:00:59 PM||request_reschedule_cpus: process exited
03/03/06 3:00:59 PM|climateprediction.net|Pausing result sulphur_hbtw_100808484_1 (removed from memory)
03/03/06 3:00:59 PM|BBC Climate Change Experiment|Restarting result hadcm3l_1f6u_00066796_0 using hadcm3l version 507
03/03/06 3:08:00 PM|climateprediction.net|Unrecoverable error for result sulphur_hbtw_100808484_1 ( - exit code -1073741819 (0xc0000005))
03/03/06 3:08:00 PM||request_reschedule_cpus: process exited
03/03/06 3:08:00 PM|climateprediction.net|Computation for result sulphur_hbtw_100808484_1 finished
03/03/06 3:09:01 PM|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
03/03/06 3:09:01 PM|climateprediction.net|Reason: To fetch work
03/03/06 3:09:01 PM|climateprediction.net|Requesting 8640 seconds of new work, and reporting 1 results
03/03/06 3:09:05 PM|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
03/03/06 3:09:07 PM|climateprediction.net|Started download of sulphur_j3e0_100890856.zip
03/03/06 3:09:09 PM|climateprediction.net|Finished download of sulphur_j3e0_100890856.zip
03/03/06 3:09:09 PM|climateprediction.net|Throughput 20848 bytes/sec
03/03/06 3:09:10 PM||request_reschedule_cpus: files downloaded
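A log like this can be scanned automatically for crashes. The following is a minimal Python sketch that assumes the `timestamp|project|message` layout seen above; the function name `find_crashes` and both regular expressions are illustrative, not anything BOINC provides. For each crash it reports the last earlier message that mentioned the same result, which makes the "removed from memory, then crashed" pattern easy to spot.

```python
import re

# Crash line layout, copied from the log above.
CRASH = re.compile(
    r"Unrecoverable error for result (\S+) "
    r"\( - exit code (-?\d+) \(0x[0-9a-fA-F]+\)\)"
)
# Any message that names a result, e.g. "Pausing result <name> (...)".
RESULT_NAME = re.compile(r"result (\S+)")


def find_crashes(lines):
    """Return (timestamp, result, exit_code, previous_message) per crash."""
    last_seen = {}  # result name -> most recent message mentioning it
    crashes = []
    for line in lines:
        parts = line.rstrip("\n").split("|", 2)
        if len(parts) != 3:
            continue  # not a timestamp|project|message line
        timestamp, _project, message = parts
        crash = CRASH.search(message)
        if crash:
            name = crash.group(1)
            crashes.append(
                (timestamp, name, int(crash.group(2)), last_seen.get(name))
            )
        named = RESULT_NAME.search(message)
        if named:
            last_seen[named.group(1)] = message
    return crashes


sample_log = [
    "03/03/06 3:00:59 PM|climateprediction.net|Pausing result sulphur_hbtw_100808484_1 (removed from memory)",
    "03/03/06 3:00:59 PM|BBC Climate Change Experiment|Restarting result hadcm3l_1f6u_00066796_0 using hadcm3l version 507",
    "03/03/06 3:08:00 PM|climateprediction.net|Unrecoverable error for result sulphur_hbtw_100808484_1 ( - exit code -1073741819 (0xc0000005))",
    "03/03/06 3:08:00 PM||request_reschedule_cpus: process exited",
]

for crash in find_crashes(sample_log):
    print(crash)
```

Keeping only the most recent message per result is a deliberate simplification: it is enough to show what the model was doing just before it failed without storing the whole log.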


Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 21027 - Posted: 3 Mar 2006, 22:19:35 UTC

> Unrecoverable error for result sulphur_hbtw_100808484_1 ( - exit code -1073741819 (0xc0000005))

Note the exit code. It gets talked about a lot, with no definitive cure. However, some people have fixed it by updating their graphics card drivers.

Making a guess: you didn't look at the graphics (the 'vis') much with just sulphur, but have looked often with the new BBC model?

Leaving the models in memory while they are paused has also helped a few people, and is usually recommended.
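The two numbers in the exit code are the same value written two ways: -1073741819 is the signed 32-bit reading of 0xC0000005, which is the Windows status code for an access violation. A minimal Python sketch of the conversion follows; the function name `to_ntstatus` and the small lookup table are illustrative, and the table lists only a few well-known codes.

```python
def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


# A few well-known Windows status codes (not an exhaustive list).
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
}

code = to_ntstatus(-1073741819)
print(hex(code), KNOWN_STATUS.get(code, "unknown"))
# prints: 0xc0000005 STATUS_ACCESS_VIOLATION
```

An access violation only says that the process touched memory it was not allowed to; it does not say which component was at fault, which fits the lack of a single definitive cure.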

MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 21038 - Posted: 4 Mar 2006, 10:13:38 UTC - in response to Message 21027.  
Last modified: 4 Mar 2006, 10:14:04 UTC

> > Unrecoverable error for result sulphur_hbtw_100808484_1 ( - exit code -1073741819 (0xc0000005))
>
> Note the exit code. It gets talked about a lot, with no definitive cure. However some people have fixed it with an update to their graphics card drivers.
> ...
>
> Leaving in memory has also helped a few people, and is usually recommended.

In addition to the above suggestions, I would recommend the following to reduce crashes:

* As suggested earlier in the thread, if you use Norton or Sophos antivirus, exclude the BOINC project directory from the automated scan.

* Before playing games etc., set 'no more work' and 'suspend' the model. That way they won't tread on each other's toes. Simultaneous use of the graphics drivers by two different programs sometimes seems to cause problems.

* Run a stability test on your machine. I recommend Prime95 from www.mersenne.org (the 'torture test' option); run it for about 24 hours. If it fails, there may be a problem with overheating or overclocking, and cleaning dust out of the motherboard and fans often helps in that case.
old_user81808

Joined: 12 Jun 05
Posts: 3
Credit: 231,370
RAC: 0
Message 21119 - Posted: 7 Mar 2006, 8:58:16 UTC - in response to Message 21038.  

> > Unrecoverable error for result sulphur_hbtw_100808484_1 ( - exit code -1073741819 (0xc0000005))
>
> Note the exit code. It gets talked about a lot, with no definitive cure. However some people have fixed it with an update to their graphics card drivers.
> ...
>
> Leaving in memory has also helped a few people, and is usually recommended.
>
> In addition to the above suggestions, here are some other ideas for reducing crashes:
>
> I would recommend the following:
>
> * As suggested earlier in the thread, if you use Norton or Sophos antivirus, exclude the boinc project directory from the automated scan.

Thanks! Definitely linked to graphics. I updated the graphics driver and also stopped using the BOINC screensaver. No more crashes. (I have an ATI X700 PCI-X card on XP.)

> * Before playing games etc, set 'no more work' and 'suspend' the model - that way they won't tread on each other's toes. Sometimes simultaneous use of graphics drivers from two different programs seems to cause problems.
>
> * Run a stability test on your machine, I recommend Prime95 from www.mersenne.org (the 'torture test' option). Run it for about 24 hours. If this fails, then there may be problems with overheating / overclocking etc, cleaning out dust from the motherboard and fans often helps if this is the case.

