Stuck at 66.666%

Profile MacDitch
Joined: 2 May 06
Posts: 17
Credit: 505,526
RAC: 0
Message 30601 - Posted: 21 Sep 2007, 11:33:49 UTC

Hi. I've got a model running offline (one of two on the machine) that was fine until it got to the end of the second phase. When it got there it did something (don't know what exactly) and then stopped. It did not produce a zip file or request internet access; it just stopped running whilst still counting up the running time.

I've tried restarting BOINC (several times now), but each time it shows activity in the task manager for a few minutes and then stops again.

Can anyone suggest a way to jump-start this model? My most recent backup is several days old, and restoring it would (obviously) mean re-running a chunk of the 'good' model as well, which I would rather avoid.

For what it's worth:
Op. Sys. - Windows XP Pro
BOINC Ver. - 5.4.9 (I know it's out of date!)
Application - hadsm3 5.06


The Scottish BOINC Team Forum
Profile MacDitch
Joined: 2 May 06
Posts: 17
Credit: 505,526
RAC: 0
Message 30604 - Posted: 21 Sep 2007, 12:06:49 UTC

Update:

Have also now done a full system reboot - with no visible benefit.

The 'something' that I said the model does on restart is "post processing", which it does for nearly five minutes. After this time it does nothing at all, and no graphic is available.

Any help would be greatly appreciated.

The Scottish BOINC Team Forum
Profile Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 30608 - Posted: 21 Sep 2007, 15:25:33 UTC

These slab models are throwing up a variety of problems: looping, going slow, going fast, re-doing phases and stopping at the post-processing.

If your backup doesn't work then it's a goner. It would be interesting to know whether the backup does save the model, since that would mean that others could be encouraged to try a restore, assuming they have a backup, of course.
Profile MacDitch
Joined: 2 May 06
Posts: 17
Credit: 505,526
RAC: 0
Message 30630 - Posted: 22 Sep 2007, 2:36:22 UTC

Ok. I'll play with the backup and see what I can do. Will let you know if it works!

The Scottish BOINC Team Forum
Profile MacDitch
Joined: 2 May 06
Posts: 17
Credit: 505,526
RAC: 0
Message 30707 - Posted: 26 Sep 2007, 4:37:40 UTC
Last modified: 26 Sep 2007, 4:38:32 UTC

I've now rammed the 66.666% mark (end of 2nd phase) three or four times with no success. I guess the model is officially kaput. :(

So, in light of this, can anyone advise a 'graceful' way of crashing the model, or should I just abort it?

Cheers,
MacDitch
Profile MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 30710 - Posted: 26 Sep 2007, 7:52:30 UTC


Just aborting it is the best way to do it. There isn't any way to enter a reason for the model being stopped, which is a pity.
I'm a volunteer and my views are my own.
News and Announcements and FAQ
Nam

Joined: 31 Aug 04
Posts: 1
Credit: 213,930
RAC: 0
Message 30714 - Posted: 26 Sep 2007, 10:18:39 UTC

Just for the record, my last result went into a loop at the end of phase 1; I'm now stopping it after 826101 secs (double the CPU time reported in the last trickle). :(

Hint: a loop of this type should be detectable, since the client sends regular trickle-up messages (I don't know their content).
Profile Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 30716 - Posted: 26 Sep 2007, 10:48:33 UTC - in response to Message 30714.  

Just for the record, my last result went into a loop at the end of phase 1; I'm now stopping it after 826101 secs (double the CPU time reported in the last trickle). :(

Hint: a loop of this type should be detectable, since the client sends regular trickle-up messages (I don't know their content).


That's the right thing to do. At least your model submitted its phase-end zip file, since the temperature and precipitation graphs are visible.

The detection of loopers from trickle records has been suggested before, but implementing such a scheme appears to be difficult. For example, repeated trickles from restored backups shouldn't cause a model to be stopped. However, it seems possible in principle, since there is no valid reason for the same trickle to be submitted, say, 10 times. I think the problem is a practical one: it would be an essentially administrative task (i.e. run a database report, send out 'killer trickle' messages, etc.) and there isn't any administrative effort available. The project's approach has therefore been to take note of problems like this and update the science application. This happened with the coupled model, which doesn't loop any more. I don't know whether the slab model branch of the software (5.06) is being developed; I suspect not.
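
To make the idea concrete, here is a rough sketch of what such a report might look like. It is only an illustration: the `Trickle` record, its `result_id` and `timestep` fields, and the `find_loopers` function are all hypothetical, since the real trickle format and database layout aren't described in this thread. The threshold of 10 is the example figure from above, chosen so that a handful of repeats from a restored backup would not be flagged.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Trickle(NamedTuple):
    """Hypothetical trickle record; the real message format is an assumption."""
    result_id: int  # which model run sent the trickle
    timestep: int   # model timestep the trickle reports


def find_loopers(trickles: Iterable[Trickle], max_repeats: int = 10) -> set[int]:
    """Return result IDs that submitted the same trickle max_repeats times or more."""
    # Count how often each (result, timestep) pair has been reported.
    counts = Counter((t.result_id, t.timestep) for t in trickles)
    # A few repeats are tolerated (restored backups); many repeats suggest a loop.
    return {result_id for (result_id, _), n in counts.items() if n >= max_repeats}


# Made-up data: result 1 repeats one trickle once (as after a backup restore),
# result 2 sends the same trickle 10 times.
records = [Trickle(1, 100), Trickle(1, 200), Trickle(1, 200), Trickle(1, 300)]
records += [Trickle(2, 500)] * 10
loopers = find_loopers(records)
print(loopers)
```

With this made-up data only result 2 would be flagged. The counting is the easy part; the administrative work of running the report and acting on it is the real obstacle.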
