Intel I7 Woes....No successful completion since April 2015

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 54313 - Posted: 15 Jun 2016, 7:14:30 UTC - in response to Message 54312.  

Art

If you do change the RAM, take out all of the other sticks as well, and then replace them. Perhaps also give the sockets a blow to make sure there's no dust in them.
Had this happen at work decades ago. Just some dust.

john
Joined: 20 May 14
Posts: 13
Credit: 7,586,474
RAC: 0
Message 54317 - Posted: 15 Jun 2016, 11:20:56 UTC - in response to Message 54312.  
Last modified: 15 Jun 2016, 11:22:30 UTC

I would swap out the two gig RAM stick and the 4 gig RAM stick that it is paired with. No need to take out all four sticks of RAM (except to clean out the dust, which is a good idea). I would then replace those 2 sticks of RAM with 2 four-gig sticks of the same brand and specifications as the two that remain in the machine.

I am utterly baffled as to why HP would put in 3 4-gig RAM sticks and 1 2-gig stick.

I am sure there was some subtle software change somewhere that is failing due to your RAM configuration and is carrying over into your CPDN stuff.
Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 54318 - Posted: 15 Jun 2016, 13:23:41 UTC - in response to Message 54317.  

I think john has done some very good investigative work and has given some very good advice. Memory should be replaced in pairs, and the mis-match that HP inflicted on you was bound to lead to trouble, and CPDN finally found it.
Art Masson
Joined: 16 Oct 11
Posts: 254
Credit: 15,954,577
RAC: 2
Message 54319 - Posted: 15 Jun 2016, 14:35:47 UTC - in response to Message 54318.  

Thanks everyone. I will order and replace all four sticks with new 4GB sticks to bring me to 16GB and eliminate the unbalanced memory pair of sticks. No idea why HP configured the machine this way to begin with. Still a bit of a mystery why problems with CPDN WUs only started on this machine in April 2015 after more than two years of processing; however, if this cures the problem I won't care!

As another minor point, I have four other machines processing CPDN files (though none of these are Intel I7 machines). They all process BOINC at 100% of CPU capacity with no problems, so it always seemed strange that I can't push my Intel I7 machine the same way and seem to have to run it at 75% load...perhaps the memory problems only occur when pushing the machine.

Anyway...thanks and a "tip of the hat" to John, Jim, and Les for your advice...sorry to continue this thread for so long trying to figure out what's going on. My original thinking was that it was some interaction between CPDN and other projects that BOINC couldn't handle or was specific to the Intel I7 platform. After all this it seems much more likely that this is unique to my machine/configuration.

Sincerely,
Art
Art Masson
Joined: 16 Oct 11
Posts: 254
Credit: 15,954,577
RAC: 2
Message 54320 - Posted: 15 Jun 2016, 14:50:06 UTC - in response to Message 54313.  

Les -- If you don't mind, one more related question for you.

As a result of all the failures over the last year or so, I now have quite a number of WUs that CPDN believes are "in process", but are no longer visible to me in BOINC. Is there any way for me to "release" these so CPDN can reassign them to others? Or must I just wait for the year to go by before CPDN realizes I will never process them and reassigns them?

I will give everyone an update after I get the new 4GB memory chips in place.

Art
Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 54321 - Posted: 15 Jun 2016, 15:11:31 UTC - in response to Message 54319.  

While I think of it, one last piece of advice. If the new memory modules have faster than standard timings, just stick with the standard timings (e.g., 9-9-9-24). That is what the motherboard will normally set by default, and you shouldn't have to do anything. The advanced timings are actually only guaranteed for a single pair; when you are using two pairs (four modules), they do not apply and can lead to trouble. I have been there and done that.
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,497,933
RAC: 6,477
Message 54322 - Posted: 15 Jun 2016, 15:32:27 UTC - in response to Message 54320.  

"I now have quite a number of WUs that CPDN believes are "in process", but are no longer visible to me in BOINC."


I think that if you let all work on the computer finish, then detach and reattach to the project that should tell CPDN that they are free.
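For anyone who prefers the command line to the BOINC Manager, the detach step can be sketched roughly as below. This only builds the `boinccmd` command line and prints it; the project URL is a placeholder (use whatever URL your client shows for the project), and the helper name `boinccmd_project_op` is made up for illustration. The `detach_when_done` operation asks the client to detach once the remaining tasks have finished, which matches the "let all work finish first" advice.

```python
import shlex

# Subset of the operations accepted by `boinccmd --project URL <op>`.
ALLOWED_OPS = {"update", "reset", "detach", "detach_when_done"}


def boinccmd_project_op(project_url, op):
    """Build (but do not run) a boinccmd command for one project operation."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operation: {op}")
    return ["boinccmd", "--project", project_url, op]


# Placeholder URL, not the real project address.
PROJECT_URL = "https://project.example/"

cmd = boinccmd_project_op(PROJECT_URL, "detach_when_done")
print(shlex.join(cmd))
```

To actually run it, the list could be passed to `subprocess.run` on a machine where `boinccmd` is on the path. Re-attaching afterwards needs `boinccmd --project_attach` with the project URL and your account key.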
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,478,808
RAC: 4,045
Message 54323 - Posted: 15 Jun 2016, 15:45:03 UTC - in response to Message 54312.  

I have to agree with others in this thread. This is an extremely unusual memory configuration and I can hardly believe HP did it.

While memtest does an alright job of testing memory, Prime95 running 8 threads with the test that uses "lots of memory" should expose anything seriously wrong with the system. Prime95 running 8 threads stresses the CPU more than 8 CPDN models, and the "lots of memory" test should also stress the memory pretty well. CPDN is much more demanding of memory than most BOINC projects, and if there's anything dodgy anywhere in the memory system (CPU memory controller, motherboard circuitry/slots, or memory), then problems will result. Since it seems to fail only when running a lot of models, it may be the processor memory controller that is overstressed. Prime95 should confirm that. If it can run stably for 24 hours, I would think your hardware is in good shape. Still, why that memory configuration?

Alex Plantema
Joined: 3 Sep 04
Posts: 126
Credit: 26,363,193
RAC: 0
Message 54324 - Posted: 15 Jun 2016, 18:53:34 UTC - in response to Message 54312.  

"The current configuration has the first bank with two 4GB chips, while the second bank has a 2GB and a 4GB chip."

If slots with the same colour don't contain identical memory modules, the system cannot run in dual-channel mode, making it slower than necessary.
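As a rough sketch of that rule, the check below flags any slot pair whose two modules differ. The `Module` fields, the pair names, and the 1600 MHz speed are assumptions for illustration; only the stick sizes come from the thread.

```python
from collections import namedtuple

# Hypothetical description of one memory module; field names are illustrative.
Module = namedtuple("Module", ["size_gb", "speed_mhz"])


def mismatched_pairs(pairs):
    """Return the names of slot pairs whose two modules are not identical."""
    return [name for name, (a, b) in pairs.items() if a != b]


# Layout described in the thread: 4 GB + 4 GB in one pair, 2 GB + 4 GB in the other.
# The speed value is assumed, not taken from the thread.
current = {
    "pair 1": (Module(4, 1600), Module(4, 1600)),
    "pair 2": (Module(2, 1600), Module(4, 1600)),
}

print(mismatched_pairs(current))  # ['pair 2']
```

Real boards also care about ranks and timings, so identical size and speed is a necessary check here rather than a complete one.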
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 54325 - Posted: 15 Jun 2016, 20:58:06 UTC

Hi Art

Keep posting as long as you like. Can't wait for the next episode. :)

Yes, I read decades ago that RAM sticks should be matching for dual-channel mode, and I've always bought them as a set at the time of building, because if you leave it for a year or so to get some more, you'll probably find that the technology has moved on and it's no longer possible to get more of what you have.

As for the "dead" models, it's not too much of a worry, because the researchers keep track of what's happening and can always issue a new batch to replace ones that are taking too long.
That is why, while data can be returned long after the BOINC deadline, the results will probably be useless: the researcher will have what he/she needs from other people.

john
Joined: 20 May 14
Posts: 13
Credit: 7,586,474
RAC: 0
Message 54326 - Posted: 15 Jun 2016, 23:31:45 UTC - in response to Message 54319.  

Just for shits and grins: Before you replace all four RAM sticks, take out the 2-gig stick and the 4-gig stick it is paired with. Then try to run the I7 at full load.

You may have some delay when you try to come in with some other task, but this is a test. See if you have errors running on 8 gig of RAM where the RAM is matched.

FYI -- recommendation is that one has at least 1.5 - 2 gig of RAM per core. But as the old saying goes, you're never too thin, too rich, or have too much RAM.

Unless it's unbalanced, mis-matched RAM that is.

You may be better with 8 gig of matched RAM than 14 of mis-matched RAM.
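The arithmetic behind the 1.5 - 2 gig rule of thumb can be sketched as follows. It assumes the rule is applied per BOINC-visible CPU (8 on this i7 with all threads in use); the function name is made up for illustration. The installed sizes compared are the 8 GB test configuration, the 14 GB (3 x 4 GB + 2 GB) machine as shipped, and the planned 16 GB.

```python
def recommended_ram_gb(cpus, low_per_cpu=1.5, high_per_cpu=2.0):
    """Return the (low, high) RAM recommendation in GB for a number of CPUs."""
    return cpus * low_per_cpu, cpus * high_per_cpu


CPUS = 8  # assumed: one model per logical CPU

low, high = recommended_ram_gb(CPUS)
print(f"recommended: {low} to {high} GB")

for installed_gb in (8, 14, 16):
    per_cpu = installed_gb / CPUS
    print(f"{installed_gb} GB installed -> {per_cpu} GB per CPU")
```

On these numbers the 8 GB test comes in under the recommended range at full load, which is why it is only a test and not a permanent fix.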
john
Joined: 20 May 14
Posts: 13
Credit: 7,586,474
RAC: 0
Message 54327 - Posted: 15 Jun 2016, 23:35:23 UTC - in response to Message 54321.  

The computer should make the faster sticks throttle back to the speed of the slower sticks automatically. No settings adjustments needed.


Just make sure that any time you open your machine you have absolutely no static.
Art Masson
Joined: 16 Oct 11
Posts: 254
Credit: 15,954,577
RAC: 2
Message 54328 - Posted: 16 Jun 2016, 2:38:04 UTC - in response to Message 54327.  

I splurged and just ordered 4 matched Crucial 8GB Single DDR3 1600 sticks. This will max out my memory at 32GB and also solve the mis-matched RAM problem.

Also just figured out how this memory situation likely happened. Checked my records and I bought this machine refurbished....I'll bet HP (or their refurbishing partner/contractor) "cheaped out" and replaced a bad 4GB chip with a cheaper 2GB one in the restoration process!!!

Still no idea why things ran well for years before failing...

Art
Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 54330 - Posted: 16 Jun 2016, 10:34:40 UTC - in response to Message 54328.  

"Still no idea why things ran well for years before failing..."

It is because that address range was not used before. Memory controllers start at one end of the range ("high" or "low" as the case may be), and work towards the other as the memory fills up. You never had so much work before.
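A toy model of that explanation, purely for illustration: memory is treated as one contiguous block filled upward from address 0, and the faulty region is placed at a made-up location near the top. Real operating systems do not allocate this neatly, so this only shows the shape of the argument.

```python
def touches_bad_region(used_gb, bad_start_gb, bad_end_gb):
    """Return True if memory filled upward from address 0 reaches the bad region.

    Toy model: usage is one contiguous block [0, used_gb) and the fault
    occupies [bad_start_gb, bad_end_gb).
    """
    if not 0 <= bad_start_gb < bad_end_gb:
        raise ValueError("bad region must satisfy 0 <= start < end")
    return used_gb > bad_start_gb


# Hypothetical numbers: a fault somewhere in the top 2 GB of a 14 GB machine.
BAD_START_GB, BAD_END_GB = 12, 14

for used_gb in (4, 8, 13):
    hit = touches_bad_region(used_gb, BAD_START_GB, BAD_END_GB)
    print(f"{used_gb} GB in use -> bad region touched: {hit}")
```

In this model a light workload never reaches the bad addresses, so the machine can look healthy for a long time and start failing only once the load grows.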
john
Joined: 20 May 14
Posts: 13
Credit: 7,586,474
RAC: 0
Message 54331 - Posted: 16 Jun 2016, 12:15:11 UTC - in response to Message 54328.  

Try removing the six gig of mis-matched RAM and see what happens.
Art Masson
Joined: 16 Oct 11
Posts: 254
Credit: 15,954,577
RAC: 2
Message 54332 - Posted: 16 Jun 2016, 20:27:46 UTC - in response to Message 54331.  

Thanks John,

At this point I'm just going to install the 4 new 8GB sticks that I bought when I get them Friday or Monday -- and hopefully my problems will disappear! Taking bets!

Will keep you advised in future posts! Obviously if the failures continue to occur with the new sticks, I'm back to go...but hopefully that doesn't happen!

Art
Art Masson
Joined: 16 Oct 11
Posts: 254
Credit: 15,954,577
RAC: 2
Message 54336 - Posted: 18 Jun 2016, 16:50:27 UTC - in response to Message 54332.  

OK....Now running with the new RAM. 32GB installed....unfortunately I didn't read the fine print and with Windows 7 Home Premium only 16GB is usable (grrrr....). I might upgrade to Windows 7 Professional, but for now at least this should solve the mis-matched memory stick problem.

Here is my computer info:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1266353

Time will tell....running with only 6 CPUs for today to verify all is OK, then will move to all 8 CPUs again and see what happens!
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,478,808
RAC: 4,045
Message 54338 - Posted: 18 Jun 2016, 17:11:12 UTC - in response to Message 54336.  

"OK....Now running with the new RAM. 32GB installed....unfortunately I didn't read the fine print and with Windows 7 Home Premium only 16GB is usable (grrrr....). I might upgrade to Windows 7 Professional, but for now at least this should solve the mis-matched memory stick problem."

Wow, that is a completely artificially imposed memory limitation by Microsoft. Just something else to get people to buy the higher priced versions.
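For reference, the per-edition caps for 64-bit Windows 7 can be put in a small lookup, which shows where the 16GB figure comes from. The limits are Microsoft's published physical-memory limits for each edition; the function and dictionary names are made up for illustration.

```python
# Physical memory limits for 64-bit Windows 7 editions, in GB.
WIN7_X64_LIMIT_GB = {
    "Home Basic": 8,
    "Home Premium": 16,
    "Professional": 192,
    "Enterprise": 192,
    "Ultimate": 192,
}


def usable_ram_gb(installed_gb, edition):
    """Return how much installed RAM the given edition will actually use."""
    return min(installed_gb, WIN7_X64_LIMIT_GB[edition])


print(usable_ram_gb(32, "Home Premium"))  # 16
print(usable_ram_gb(32, "Professional"))  # 32
```

So with 32GB installed, moving from Home Premium to Professional would be enough to make all of it usable.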
john
Joined: 20 May 14
Posts: 13
Credit: 7,586,474
RAC: 0
Message 54340 - Posted: 18 Jun 2016, 22:55:52 UTC - in response to Message 54336.  

"OK....Now running with the new RAM. 32GB installed....unfortunately I didn't read the fine print and with Windows 7 Premium only 16GB is usable"

Whaddi say? huh? HUh? Tol' ya to take out the six gig of mismatched RAM and put in 8 gig of matched RAM! "Art!" I sez, "take out the 2-gig stick and the 4-gig stick it is paired with." But did Art listen to me? Noooooo. Can't listen to john, HE isn't worth listening to. Oh sure, they pay LIP service to me, "Tip o' the hat" he sez, but actually LISTENING to me? Can't be bothered, canne now?

But nooooooooooooo, mister "I splurged" (that's wha ya do in a loo innit?) he sez. Throwing his money around and rubbing it in our faces, I say. Well, serves him right. Carrying on like that. Bet he threw out the mismatched RAM too! "Na good enuf" I can hear him say! "Out with it"!

Oh Lord. I. thy humble servant.

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 54342 - Posted: 19 Jun 2016, 10:07:50 UTC - in response to Message 54338.  

"Wow, that is a completely artificially imposed memory limitation by Microsoft. Just something else to get people to buy the higher priced versions."

How else can MS pay for Windows 10? But while we are on that awful subject, there is one advantage: the corresponding Win 10 version removes that limit, so you could upgrade for free and solve that problem. I did that on my laptop; there were several error messages that I had to work through, but after chasing them all down and doing the appropriate registry fixes and whatever other incantations were necessary, it all works now, for a laptop.

I avoid Win10 on my dedicated PCs where I do the crunching for a variety of reasons, but the main one is that you lose control over driver upgrades. MS pushes them to you whether you want them or not. But you can get a Windows 7 install/repair disk that includes the Pro version with no license, for $10. Then, if you shop around, you can get a Win7 64-bit OEM Pro license for only $30. I would post the links here, but I don't know if that is allowed.



©2024 climateprediction.net