CPDN locks up computer shortly after starting work

John
Joined: 7 Dec 05
Posts: 13
Credit: 5,678,097
RAC: 0
Message 21768 - Posted: 1 Apr 2006, 5:11:38 UTC

I have processed other work units, but this one is particularly touchy (hadcm3lb_59xj_05034131_0, and another one like it with the same first 5 characters).
My settings: no screen saver, and the process is left in memory when switching. I can process any of 4 other projects (Seti, Boinc, Einstein and World Grid) with no problems. I have reset this project and do not plan to participate again until this is fixed. I understand that some processes will abort, but at least that allows someone else's to continue. A lock-up is often not discovered until the next morning. Later. JB
ID: 21768
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 21776 - Posted: 1 Apr 2006, 13:42:08 UTC

Have you read through the top-most 'sticky' in this forum?
I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 21776
John
Joined: 7 Dec 05
Posts: 13
Credit: 5,678,097
RAC: 0
Message 21789 - Posted: 1 Apr 2006, 17:02:23 UTC - in response to Message 21776.  

"Have you read through the top-most 'sticky' in this forum?"

Yes, it does not appear to apply. For one thing, there is no Windows send/do not send dialogue. The system locks up and there is either a black screen or the screen shows the last thing displayed, but not even the cursor will move.
ID: 21789
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 21792 - Posted: 1 Apr 2006, 18:34:16 UTC

If you could list which things you tried, and which you didn't, that would help. In particular, did you try 'Prime95'?
ID: 21792
John
Joined: 7 Dec 05
Posts: 13
Credit: 5,678,097
RAC: 0
Message 21867 - Posted: 4 Apr 2006, 3:52:09 UTC - in response to Message 21792.  

"If you could list which things you tried, and which you didn't, that would help. In particular, did you try 'Prime95'?"


I did not try Prime95. The machine is not overclocked, and I have quality memory. I have run other CPDN work units up to about 100 CPU hours or so, by which time they crashed due to computation errors. After the last crash I did a reset and downloaded a new work unit, which crashed within 5-10 seconds of starting. No overheating here. I reset the project again and downloaded a new work unit, which locked up with a black screen (no info) within 5-10 seconds of starting.
I suppose that there could still be a problem with my computer, but it would be very difficult to isolate, and I am still happily processing those other projects, which by the way load up the CPU as well. I play some games regularly and there is no problem with any of them.
In summary, I can still process other projects and play games. I am not looking to spend hours isolating an error, IF it is on my computer.
I have run various benchmarks on all my computers and they have all completed successfully. Another computer is running CPDN (now up to about 500 hours), so I guess that will be my contribution to the project. Thanks for your interest.
ID: 21867
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 21868 - Posted: 4 Apr 2006, 6:32:02 UTC

It makes it hard to diagnose problems if we have no idea whether it's hardware or software; Prime95 is the easiest way to check whether it's one or the other. I wasn't recommending Prime95 because of overclocking; I was suggesting it in order to confirm that your machine's hardware is stable.

What are your actual CPU temperatures? Keep in mind that, with spring here, temperatures are rising quite quickly, so we'd expect to see some machines which were running well start to struggle (which matches the pattern that you're experiencing).

If Prime95 is OK and the CPU temps are OK, then we'd need to look at the software side of things.
ID: 21868
John
Joined: 7 Dec 05
Posts: 13
Credit: 5,678,097
RAC: 0
Message 21949 - Posted: 8 Apr 2006, 8:50:30 UTC - in response to Message 21868.  

The motherboard temp is 88 F and the CPU is 111 F when running the projects. So temp is not the issue, and since I have run previous CPDN work units for up to 100 hours, I do not believe that it is a hardware error. This suddenly occurred with this new series of CPDN projects and, as I previously said, I am NOT looking to spend hours diagnosing the problem. Therefore I will NOT be running another benchmark program. Again, thanks for your interest, but I am content to run other projects instead. Thanks. JB
ID: 21949
Nobody
Joined: 2 Dec 05
Posts: 6
Credit: 4,525,553
RAC: 0
Message 21969 - Posted: 10 Apr 2006, 9:03:03 UTC - in response to Message 21789.  


I encountered this issue when my client started trying to crunch the 'hadxxx...' models on the ClimatePrediction project, having stayed up for weeks on end continuously crunching the sulphur cycle model with no problems at all.

I've got to wonder whether it's an AMD/Intel issue - I've got the same model running here on my work machine, which is a dual Intel 2.8GHz, while my home machine was a dual AMD 2800+ (which benchmarked significantly greater crunching power than the 'equivalent' Intel). I ended up having to rebuild my home machine, which is now running a new M/B with an AMD 3200+, but it still freezes running this new model. Neither machine was overclocked and they've both got 1GB of memory.

After several unsuccessful resets of the project, detaching and reattaching, and deleting the failed jobs, I've suspended the climate prediction model at home on the AMD machine, and it's happily crunching World Community Grid work packets instead.

ID: 21969
John
Joined: 7 Dec 05
Posts: 13
Credit: 5,678,097
RAC: 0
Message 22006 - Posted: 13 Apr 2006, 0:26:52 UTC - in response to Message 21969.  

It is possible that in my case this is a Windows issue (registry, duplicate DLLs, etc.), as I have noticed on occasion that when you move between "Work", "transfer", and "Messages" the window will stick for a moment and then you have to click again to get it to update. Mine is also an AMD (2600), but so is the computer that has about 500 hours of processing complete on another work unit (not this series). Someday I may have to reinstall Windows to clear out some of the old crap that still hangs around from uninstalls.


ID: 22006
Nobody
Joined: 2 Dec 05
Posts: 6
Credit: 4,525,553
RAC: 0
Message 23132 - Posted: 13 Jun 2006, 7:50:23 UTC - in response to Message 21969.  


Hmmmm... For anyone who, like me, dismissed any possibility of issues with their hardware platform ...

I ran Prime95 as recommended in the 'sticky' post, and it almost immediately barfed with a calculation error, and did so with every attempt at a proper stress test. Turns out that I've got hardware issues. Further diagnosis shows that my CPU is running slightly hot (51 degC under load), and on advice I found that the voltages set in my BIOS - although set by SPD - were slightly low for the memory I've got installed (Corsair Twin-X).

Bumping up the memory voltage immediately made it noticeably more stable, but not completely: it still fails Prime95 stress tests, just not every time any more. I've now got to address the heat issue at the CPU (probably need to reseat it after cleaning up and reapplying (less of) the thermal paste!)

Hope that helps anyone else with odd crashing problems...

Sean.


ID: 23132
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 23133 - Posted: 13 Jun 2006, 8:01:30 UTC

Hi Sean,

Glad that you're on the way to sorting out your machine! :-)

It may be worth underclocking your machine slightly (reduce the clock by, say, 1 or 2%) to see if that makes a difference.
ID: 23133
Nobody
Joined: 2 Dec 05
Posts: 6
Credit: 4,525,553
RAC: 0
Message 23137 - Posted: 13 Jun 2006, 11:20:04 UTC - in response to Message 23133.  


Yeah, it's easy to jump to conclusions - Half-Life 2 was playing just fine and that is extremely intensive 3D processing (particularly the Day of Defeat mod), and since previous BOINC models had been running OK, the obvious thing to blame was the new models which it started trying to process. As the saying goes, 'assume' makes an ASS out of U and ME ...

I'll post again once I've reseated the CPU and got heat output under control. It should be running at just over 40 degC. If that doesn't stabilise it, underclocking may be the next step, which would be a shame as the previous owner of the board had it running nicely with overclocked settings for both CPU and memory.

Sean.

ID: 23137
Pooh Bear 27
Joined: 5 Feb 05
Posts: 465
Credit: 1,914,189
RAC: 0
Message 23151 - Posted: 13 Jun 2006, 20:06:04 UTC - in response to Message 23137.  

"Yeah, it's easy to jump to conclusions - Half-Life 2 was playing just fine and that is extremely intensive 3D processing (particularly the Day of Defeat mod), and since previous BOINC models had been running OK, the obvious thing to blame was the new models which it started trying to process."

The difference between Half-Life and BOINC projects is that the games run intensely on the video card. The processor itself is not pegged. BOINC processes peg your processor to near its limits. In fact, many of the currently released processors could easily run BOINC projects and Half-Life simultaneously and not balk.

It's all perspective. Glad to see it is getting close to becoming a crunching machine again!

ID: 23151

©2024 climateprediction.net