climateprediction.net home page
CPU clock and percentage not counting, but trickle been sent

Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 34566 - Posted: 8 Aug 2008, 20:53:55 UTC

I've already posted this in the other forum; I don't know which one is better frequented, as I usually don't have so many questions any more ;) Here's what happened:

At first I thought my WU was stuck in a loop, as mentioned in other threads. It uses one full core of my quad, but neither the CPU time nor the percentage has moved at all in the last few hours; they are stuck at 81:09:25 and 66.666%.

Then, all of a sudden, a trickle was sent an hour ago, and a look at my trickles revealed that the last one was sent after 84:27:26 h and is #2 of the 3rd phase of this HADSM. There was even one before that, at 82:48:19 h, that I didn't spot at first.
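
For anyone who wants to compare the stuck clock with the trickle times, here is a minimal Python sketch. The helper names are made up for illustration and are not part of BOINC; the times are the ones quoted in this post.

```python
def hms_to_seconds(text: str) -> int:
    """Convert an 'H:MM:SS' CPU-time string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_hms(total: int) -> str:
    """Format a number of seconds back into 'H:MM:SS'."""
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


stuck_clock = hms_to_seconds("81:09:25")   # CPU time shown by the stuck display
last_trickle = hms_to_seconds("84:27:26")  # CPU time reported with the latest trickle
gap = last_trickle - stuck_clock

print(seconds_to_hms(gap))  # how far the display lags behind the trickle
```

With these figures the displayed clock is a little over three and a quarter hours behind the CPU time on the latest trickle.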

I'm a bit calmer now about wasting CPU time through looping, but I still don't like the stuck clock. Is there anything I can do about it?
Greetings from Sänger
ID: 34566
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 34567 - Posted: 8 Aug 2008, 21:38:57 UTC

Keeping in mind that one should NEVER interrupt a slab model between the start of its end-of-phase activity and the first checkpoint in the new phase, exiting from BOINC and restarting it will sort out at least 2 problems:

1) The 'blobby' text on the globe display
2) The failure of the GUI to pick up changes to the model's percentage (and perhaps other un-updated info).

ID: 34567
Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 34568 - Posted: 8 Aug 2008, 21:48:23 UTC - in response to Message 34567.  

[Les Bayliss wrote:] Keeping in mind that one should NEVER interrupt a slab model between the start of its end-of-phase activity and the first checkpoint in the new phase, exiting from BOINC and restarting it will sort out at least 2 problems:

1) The 'blobby' text on the globe display
2) The failure of the GUI to pick up changes to the model's percentage (and perhaps other un-updated info).

If anything has interrupted the model, it was BOINC; the puter (and so BOINC) has not been turned off since Wednesday morning.

I almost never look at the graphics, and I don't even know how to install them as a screen saver under Linux, so I know nothing about anything 'blobby'.

I'll give the restart a try.
Greetings from Sänger
ID: 34568
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 34569 - Posted: 8 Aug 2008, 22:05:08 UTC

The part about not interrupting the models was meant for anyone reading this thread who hasn't read the warnings in the News and Announcements thread, so that they don't do this themselves.
Interrupting a slab model at this point causes the model to start again from the beginning, wasting time.

The globe display is available by clicking on the Show graphics button in the Tasks tab of the gui.

ID: 34569
Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 34572 - Posted: 9 Aug 2008, 5:02:18 UTC - in response to Message 34568.  

[Saenger wrote:] I'll give the restart a try.

I did, slept for a while, and now the clock is running again.
However, it resumed from where it was stuck. So this will mean either a very good credits/h ratio, if the 5.5 h are simply not counted, or more trickles than usual. Let's wait and see; it should be finished in about a week or so.

As you can see from the link in my first post, the last trickle has a lower CPU Time (sec) value than the one before (#51: 310017 sec, #52: 296924 sec). It's still running, and I don't see anything obviously wrong with the WU on my puter besides the short time, so I hope for the best.
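
A check like this can be automated. The following is only a sketch in Python: it assumes the trickle list has already been copied by hand into (number, CPU seconds) pairs, and the function name is hypothetical rather than anything BOINC or CPDN provides.

```python
def find_cpu_time_regressions(trickles):
    """Return (previous, current) pairs where the reported CPU time went down.

    trickles: list of (trickle_number, cpu_seconds) tuples in the order sent.
    """
    regressions = []
    for prev, curr in zip(trickles, trickles[1:]):
        if curr[1] < prev[1]:
            regressions.append((prev, curr))
    return regressions


# The two trickles quoted in this post.
trickles = [(51, 310017), (52, 296924)]

for (n_prev, t_prev), (n_curr, t_curr) in find_cpu_time_regressions(trickles):
    print(f"#{n_curr} reports {t_prev - t_curr} s less CPU time than #{n_prev}")
```

For the two values above, the drop between #51 and #52 is 13093 seconds, a bit over three and a half hours.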

Thanks anyway.
Greetings from Sänger
ID: 34572
Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 34644 - Posted: 15 Aug 2008, 15:40:11 UTC

It finished fine; only the value for Average (sec/TS) is a bit too small: 0.5372, when it should be something like 0.5617.
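
As a rough consistency check on those two averages, here is a small Python sketch. It assumes the shortfall comes entirely from the roughly 5.5 h of CPU time that went uncounted while the clock was stuck, as mentioned earlier in the thread; that assumption and the variable names are illustrative only.

```python
reported_avg = 0.5372      # Average (sec/TS) shown on the results page
expected_avg = 0.5617      # value the poster expected
uncounted_s = 5.5 * 3600   # ASSUMPTION: 5.5 h of CPU time was never counted

# Relative shortfall of the reported average.
shortfall = 1 - reported_avg / expected_avg

# If the same number of timesteps was run either way, the uncounted time
# divided by the difference in averages gives the implied timestep count.
timesteps = uncounted_s / (expected_avg - reported_avg)
counted_hours = reported_avg * timesteps / 3600

print(f"shortfall: {shortfall:.1%}")
print(f"implied counted CPU time: {counted_hours:.1f} h")
```

Under that assumption the counted CPU time works out to roughly 120 hours, which is in line with the clock having stood at 81:09:25 when the model was at 66.666%.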
Greetings from Sänger
ID: 34644
DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 34677 - Posted: 18 Aug 2008, 15:11:11 UTC - in response to Message 34644.  

[Saenger wrote:] It finished fine; only the value for Average (sec/TS) is a bit too small: 0.5372, when it should be something like 0.5617.

I've got an E8400 @ 3.0GHz but it doesn't crunch that fast!

Y'know, it finished OK and your graphs look plausible. So, I'd say mission accomplished.
ID: 34677
old_user12561
Joined: 4 Sep 04
Posts: 20
Credit: 29,656
RAC: 0
Message 34799 - Posted: 27 Aug 2008, 17:52:38 UTC
Last modified: 27 Aug 2008, 17:54:27 UTC

Funny that this is the same point at which mine was stuck (66.666%) and sending trickles every five seconds according to BOINC. Restarted the BOINC app and she seems happy.
ID: 34799
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 34808 - Posted: 27 Aug 2008, 21:45:54 UTC - in response to Message 34799.  

[DaBrat and DaBear wrote:] Funny that this is the same point at which mine was stuck (66.666%) and sending trickles every five seconds according to BOINC. Restarted the BOINC app and she seems happy.
It's normal for a slab model to send lots of trickles at a phase change. The recommended action is to let it continue - they should stop after a few tens of minutes. Stopping a slab when it's doing this can cause the model to go right back to the beginning. You were lucky!
ID: 34808
Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 34861 - Posted: 1 Sep 2008, 21:54:33 UTC - in response to Message 34677.  

[DJStarfox wrote:] I've got an E8400 @ 3.0GHz but it doesn't crunch that fast!

Y'know, it finished OK and your graphs look plausible. So, I'd say mission accomplished.

My current one is at 0.533 s/TS now, which is 58 C/h. My puter is a Q9450 running at 3.2 GHz under 64-bit Ubuntu 8.04.

The HADSMs run at more than double the credit rate of HADAM (26 C/h) and about 45% more than HADCM (40 C/h).
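
To put those rates side by side, here is a quick Python sketch using only the credits-per-hour figures quoted in this post (the dictionary name is arbitrary):

```python
# Credits per hour on this machine, as quoted in the post.
rates = {"HADSM": 58, "HADAM": 26, "HADCM": 40}

for model in ("HADAM", "HADCM"):
    ratio = rates["HADSM"] / rates[model]
    print(f"HADSM vs {model}: {ratio:.2f}x the credit rate")
```

That gives a ratio of about 2.23 against HADAM and 1.45 against HADCM.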
Greetings from Sänger
ID: 34861
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34865 - Posted: 2 Sep 2008, 1:06:52 UTC
Last modified: 2 Sep 2008, 1:18:49 UTC

The Q9000 series does look very fast.

The credits are supposed to be approximately equal for all 3 types of model and on all the CPDN projects (CPDN, Beta, BBC and SAP). But some members find, for example, that HADCMs run faster on Linux, at least on some computers. I find that my Intel Core2Duo with Windows performs relatively badly with HADSMs.

So yes, it's sometimes possible for CPDN members to increase their credits by noticing how the different model types perform on their machines and selecting the types that earn most credits/hour. This is also good for the project.

When the BOINC6-compliant models currently being beta-tested are released on CPDN, the new model versions may perform differently on particular computers and OSs, so we'll need to look at their credits/hour again.
Cpdn news
ID: 34865
Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 34968 - Posted: 11 Sep 2008, 15:42:05 UTC - in response to Message 34865.  

[mo.v wrote:] When the BOINC6-compliant models currently being beta-tested are released on CPDN, the new model versions may perform differently on particular computers and OSs, so we'll need to look at their credits/hour again.

Will they be marked somehow?
I'd like to keep track of them if I get any.
Greetings from Sänger
ID: 34968
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 34969 - Posted: 11 Sep 2008, 18:13:26 UTC
Last modified: 11 Sep 2008, 18:13:54 UTC

You can tell which application version is running in the Tasks tab of BOINC Manager. When a model finishes, the application version number is also reported on the results page for the model - though not when it's running.
ID: 34969
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34971 - Posted: 11 Sep 2008, 19:44:56 UTC
Last modified: 11 Sep 2008, 19:46:06 UTC

All the model/application versions called 6.** are BOINC6-compliant. The ones called 5.** are not. But they all run on both BOINC5 and BOINC6, with one problematic exception that I think we'll soon be announcing in the News thread at the top of Number Crunching.

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/apps.php
Cpdn news
ID: 34971
Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 35256 - Posted: 15 Oct 2008, 15:44:32 UTC

The current MidHol models seem to be even better than the usual HADSMs. The HADSMs average around 58 c/h (0.53 s/TS), while the MidHols are above 60 c/h (my current one is at 0.52 s/TS).

If you crunch CPDN on a 64-bit Linux system with an Intel quad, choose them ;)
By the way, if you run more than one in parallel, the performance goes down significantly.
Greetings from Sänger
ID: 35256
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 35261 - Posted: 15 Oct 2008, 22:52:59 UTC

I'm sure your very nice computer is one reason for the fast crunching.....

Some people have found that HADCM models run faster on Linux than Windows but I don't know whether anyone has compared HADSM or HADSM MH models in this way.

Other members recommend running different model types side by side. Too many HADAMs on one machine can definitely slow each other down when they compete for memory.
Cpdn news
ID: 35261
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2169
Credit: 64,555,907
RAC: 5,858
Message 35264 - Posted: 15 Oct 2008, 23:18:31 UTC - in response to Message 35261.  

[mo.v wrote:] Some people have found that HADCM models run faster on Linux than Windows but I don't know whether anyone has compared HADSM or HADSM MH models in this way.

That was the case for quite a while. Now that the hadcm3 6.0x app has been released here, its Linux performance has degraded significantly, so that it is now considerably slower than on Windows.
ID: 35264
Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 35267 - Posted: 16 Oct 2008, 4:45:53 UTC - in response to Message 35261.  

[mo.v wrote:] I'm sure your very nice computer is one reason for the fast crunching.....

That may be one part of it, but...
This is not about CPDN vs. other projects, but about HADSM vs. HADAM vs. HADCM vs. MidHol on the same machine. A look at the current applications shows me that I have to test one of each again: the apps are newer than at my last test, except the one for HADSM-Lin64, which is still 5.10.
Greetings from Sänger
ID: 35267
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 35274 - Posted: 16 Oct 2008, 11:25:47 UTC

Tolu's now made nearly all the models BOINC6-compliant and in some cases has added other changes as well.
Cpdn news
ID: 35274
Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 35276 - Posted: 16 Oct 2008, 16:19:12 UTC - in response to Message 35274.  

[mo.v wrote:] Tolu's now made nearly all the models BOINC6-compliant and in some cases has added other changes as well.

Almost ;)
I'll give the other ones a try.
Greetings from Sänger
ID: 35276
