Complete, but still running.

Message boards : Number crunching : Complete, but still running.

adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,021,020
RAC: 816
Message 46047 - Posted: 25 Apr 2013, 16:04:42 UTC

My current model completed some time this morning. When I looked at about 08:00 it was at 100%, remaining ---, but was still running. I've seen similar before, as finished jobs write and compress their result files etc., but it is still running now, some 10 hours later.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
Byron Leigh Hatch @ team Carl ...
Joined: 17 Aug 04
Posts: 289
Credit: 44,103,664
RAC: 0
Message 46049 - Posted: 25 Apr 2013, 18:14:18 UTC - in response to Message 46047.  

Hi adrianxw,
Is this the model you're speaking about? hadcm3n_zi2g_1880_40_008249779
I put up this link in case it might help the Forum Moderators.
adrianxw
Joined: 31 Aug 04
Posts: 145
Credit: 2,021,020
RAC: 816
Message 46050 - Posted: 25 Apr 2013, 20:01:41 UTC - in response to Message 46049.  
Last modified: 25 Apr 2013, 20:03:50 UTC

Yes, that is the one. It is still running now.

<edit>
Something else I just noticed: it was sending trickles up regularly, but they seem to have stopped a few days ago.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 46052 - Posted: 25 Apr 2013, 20:25:36 UTC - in response to Message 46050.  

That model has failed well short of the finish.
It's from 22 November 2012, and has failed on all computers that ran it.
Just abort it and save electricity.
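For anyone who would rather abort from the command line than from BOINC Manager, here is a minimal sketch. It assumes `boinccmd` is installed and on the PATH; the project URL and the `_0` suffix on the task name are placeholders rather than values taken from this thread, so check both against your own client's task list first.

```python
import subprocess


def build_abort_command(project_url: str, task_name: str) -> list[str]:
    """Return the boinccmd argument list that aborts one task."""
    # boinccmd syntax: --task <project URL> <task name> <operation>
    return ["boinccmd", "--task", project_url, task_name, "abort"]


def abort_task(project_url: str, task_name: str, dry_run: bool = True) -> list[str]:
    """Print the command and run it only when dry_run is False."""
    cmd = build_abort_command(project_url, task_name)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Placeholder values: substitute the URL and task name your client reports.
    abort_task("http://climateprediction.net/", "hadcm3n_zi2g_1880_40_008249779_0")
```

The default is a dry run that only prints the command, so nothing is aborted until `dry_run=False` is passed deliberately.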


3rkko
Joined: 12 Feb 08
Posts: 66
Credit: 4,877,652
RAC: 0
Message 46054 - Posted: 26 Apr 2013, 20:20:41 UTC - in response to Message 46049.  

The last trickle is at the 75% time-step. The model must have crashed at one of those "sensitive" 25, 50, 75% steps but still somehow continued, even though it was no longer doing any useful work.
old_user514136
Joined: 24 Apr 08
Posts: 6
Credit: 176,830
RAC: 0
Message 46181 - Posted: 10 May 2013, 21:18:38 UTC

I appear to be having a similar problem and was wondering if mine might also be one of those problematic tasks. The detail page is at

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15613742 and its name is hadcm3n_4it3_1940_40_008311085_1

It has been at 100% for a few days and still running and consuming CPU (according to my laptop stats), but the numbers on the properties page are not increasing.

I'm guessing this one should be taken out back and shot, but wanted to confirm first, just in case.
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 46182 - Posted: 10 May 2013, 22:02:32 UTC - in response to Message 46181.  

The last Timestep for that one is 1,036,800, so it's finished, but it just doesn't want to stop.
There have been a few like that, and I've had 1 or 2 myself.

So, yes, you'll have to kill it off yourself.
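The check being made in these posts can be sketched in a few lines: compare the timestep of the last trickle with the model's final timestep. The figure 1,036,800 is the one quoted above; treating it as the final timestep for every 40-year hadcm3n task is an assumption, and the function names and labels are made up for illustration.

```python
# Final timestep quoted in this thread; assumed to apply to all 40-year hadcm3n tasks.
FINAL_TIMESTEP = 1_036_800


def fraction_done(last_trickle_timestep: int, final_timestep: int = FINAL_TIMESTEP) -> float:
    """Fraction of the model run covered by the last trickle."""
    if not 0 <= last_trickle_timestep <= final_timestep:
        raise ValueError("timestep outside the model's range")
    return last_trickle_timestep / final_timestep


def diagnose(last_trickle_timestep: int, shows_100_percent: bool,
             final_timestep: int = FINAL_TIMESTEP) -> str:
    """Rough label for a task, based on its last trickle and what BOINC Manager shows."""
    if last_trickle_timestep >= final_timestep:
        # The model reached the end; if it is still using CPU it has to be aborted by hand.
        return "finished, but not stopping"
    if shows_100_percent:
        # 100% in the manager without the final trickle: the model stopped reporting early.
        return "failed short of the finish"
    return "still in progress"


if __name__ == "__main__":
    print(fraction_done(777_600))        # a last trickle at the 75% mark
    print(diagnose(1_036_800, True))
    print(diagnose(777_600, True))
```

Under that assumption, a last trickle at timestep 777,600 corresponds to the 75% mark mentioned earlier in the thread.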


old_user514136
Joined: 24 Apr 08
Posts: 6
Credit: 176,830
RAC: 0
Message 46185 - Posted: 10 May 2013, 23:45:31 UTC - in response to Message 46182.  

Thought so, but thanks for the confirmation.
Nigel Garvey
Joined: 5 May 10
Posts: 69
Credit: 1,169,103
RAC: 2,258
Message 46560 - Posted: 1 Jul 2013, 21:42:11 UTC

I've recently aborted an hadcm3n which was pronounced finished after its penultimate trickle-up but which kept going for a day or so until I zapped it. There's still a 1.6 GB folder in the project folder with the name of the aborted task, which resetting the project hasn't cleared. I presume it's OK to delete this and the associated XML file manually?

NG
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 46561 - Posted: 1 Jul 2013, 22:35:34 UTC - in response to Message 46560.  

That model didn't finish. But that's yet another failure mode with this model type: get very close to the end and then fail for some reason. Bad luck there.
(I guess that you're talking about the BOINC manager saying finished; i.e. at 100%. This happens when it stops getting info back from the model, even if the model hasn't actually finished.)

So yes, you have to get rid of that folder and file manually.
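A cautious way to do the manual clean-up is to list what would be removed before deleting anything. The sketch below assumes the leftover folder and its XML file both have names that start with the task name; the thread only says the folder is named after the task, so the XML naming, the function names and the paths are all hypothetical.

```python
from pathlib import Path
import shutil


def find_leftovers(project_dir: Path, task_name: str) -> list[Path]:
    """Return entries in project_dir whose names start with task_name."""
    # A prefix match can also catch other tasks with longer names, so review the list.
    return sorted(p for p in project_dir.iterdir() if p.name.startswith(task_name))


def size_in_bytes(path: Path) -> int:
    """Size of a file, or total size of all files under a folder."""
    if path.is_file():
        return path.stat().st_size
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def remove_leftovers(project_dir: Path, task_name: str, dry_run: bool = True) -> list[Path]:
    """List matching entries; delete them only when dry_run is False."""
    found = find_leftovers(project_dir, task_name)
    for entry in found:
        print(f"{entry}  ({size_in_bytes(entry)} bytes)")
        if not dry_run:
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    return found
```

Run it once with the default `dry_run=True`, check that only the aborted task's folder and file are listed, and only then call it again with `dry_run=False`. Doing this while BOINC is not running is probably the safer choice.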


Nigel Garvey
Joined: 5 May 10
Posts: 69
Credit: 1,169,103
RAC: 2,258
Message 46563 - Posted: 1 Jul 2013, 23:22:22 UTC - in response to Message 46561.  

Thanks, Les. Yes I did mean it was showing in BOINC Manager as completed.

Les Bayliss wrote:
Bad luck there.

While it would have been nice to have completed it properly and to have uploaded the final data, I suppose I did get a lot further with it than the two previous recipients! :)

NG
Nigel Garvey
Joined: 5 May 10
Posts: 69
Credit: 1,169,103
RAC: 2,258
Message 46580 - Posted: 3 Jul 2013, 8:59:34 UTC - in response to Message 46560.  

I wrote:
I've recently aborted an hadcm3n which was pronounced finished after its penultimate trickle-up but which kept going for a day or so until I zapped it.


I notice that the information about the task has now been updated on the Web site. The trickle at timestep 1,036,800 was returned and I've received the full credit. It seems only the last batch of data wasn't uploaded. The stderr output is full of raving about environment variables being ignored.
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15793995

NG
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1079
Credit: 6,904,049
RAC: 6,657
Message 46582 - Posted: 3 Jul 2013, 12:32:16 UTC - in response to Message 46580.  

... The stderr output is full of raving about environment variables being ignored.
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15793995...
That's a Mac thing and neither CPDN- nor BOINC-specific - at least, last time I looked.