climateprediction.net home page
Complete, but still running.

Profile adrianxw
Joined: 31 Aug 04
Posts: 118
Credit: 1,748,281
RAC: 0
Message 46047 - Posted: 25 Apr 2013, 16:04:42 UTC

My current model completed some time this morning. When I looked at about 08:00 it was at 100%, with remaining time shown as ---, but it was still running. I've seen something similar before while finished jobs write and compress their result files, but it is still running now, some 10 hours later.
____________
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

Profile Byron Leigh Hatch @ team Carl Sagan
Joined: 17 Aug 04
Posts: 282
Credit: 43,419,373
RAC: 227
Message 46049 - Posted: 25 Apr 2013, 18:14:18 UTC - in response to Message 46047.

Hi adrianxw,
Is this the model you're speaking about? hadcm3n_zi2g_1880_40_008249779
I put up this link in case it might help the Forum Moderators.

Profile adrianxw
Joined: 31 Aug 04
Posts: 118
Credit: 1,748,281
RAC: 0
Message 46050 - Posted: 25 Apr 2013, 20:01:41 UTC - in response to Message 46049.
Last modified: 25 Apr 2013, 20:03:50 UTC

Yes, that is the one. It is still running now.

<edit>
Something else I just noticed: it was sending trickles up regularly, but they seem to have stopped a few days ago.
____________
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 6925
Credit: 20,843,205
RAC: 1
Message 46052 - Posted: 25 Apr 2013, 20:25:36 UTC - in response to Message 46050.

That model has failed well short of the finish.
It's from 22 November 2012, and has failed on all computers that ran it.
Just abort it and save electricity.
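
For anyone who prefers the command line to the BOINC Manager "Abort" button, here is a minimal sketch of building the equivalent `boinccmd` call. It assumes `boinccmd` is on the PATH and that the project is attached under the URL in `PROJECT_URL`; both are assumptions, so check the URL and the exact task name shown by your own client before running anything.

```python
import shlex

# Assumption: the URL the project is attached under on this machine.
# Check the client's project list; it may differ from this value.
PROJECT_URL = "http://climateprediction.net/"


def abort_command(task_name, project_url=PROJECT_URL):
    """Return the boinccmd argument list that aborts a single task."""
    if not task_name or any(ch.isspace() for ch in task_name):
        raise ValueError("task name must be non-empty and contain no whitespace")
    # boinccmd syntax: --task <project URL> <task name> {suspend|resume|abort}
    return ["boinccmd", "--task", project_url, task_name, "abort"]


if __name__ == "__main__":
    # Model name taken from this thread; use the exact task name your client shows.
    cmd = abort_command("hadcm3n_zi2g_1880_40_008249779")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

The function only builds the command, so the result can be inspected before anything is executed.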


3rkko
Joined: 12 Feb 08
Posts: 66
Credit: 4,877,652
RAC: 0
Message 46054 - Posted: 26 Apr 2013, 20:20:41 UTC - in response to Message 46049.

The last trickle is at the 75% time-step. The model must have crashed at one of those "sensitive" 25%, 50% or 75% steps but somehow kept running even though it was no longer doing any useful work.

old_user514136
Joined: 24 Apr 08
Posts: 6
Credit: 176,830
RAC: 0
Message 46181 - Posted: 10 May 2013, 21:18:38 UTC

I appear to be having a similar problem and was wondering if mine might also be one of those problematic tasks. The detail page is at

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15613742 and its name is hadcm3n_4it3_1940_40_008311085_1

It has been at 100% for a few days and is still running and consuming CPU (according to my laptop stats), but the numbers on the properties page are not increasing.

I'm guessing this one should be taken out back and shot, but wanted to confirm first, just in case.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 6925
Credit: 20,843,205
RAC: 1
Message 46182 - Posted: 10 May 2013, 22:02:32 UTC - in response to Message 46181.

The last Timestep for that one is 1,036,800, so it's finished, but it just doesn't want to stop.
There have been a few like that, and I've had 1 or 2 myself.

So, yes, you'll have to kill it off yourself.
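
The check described above (final timestep reached, but the task keeps running) can be sketched in a few lines. This is only an illustration: the readings are hypothetical values noted by hand from the task's result page, and the 24-hour threshold is an arbitrary assumption, not a project rule. The final timestep of 1,036,800 is the figure quoted in this thread for these tasks.

```python
# Final timestep quoted in this thread for these hadcm3n tasks.
FINAL_TIMESTEP = 1_036_800


def fraction_done(timestep, final_timestep=FINAL_TIMESTEP):
    """Fraction of the model run covered by the given timestep, capped at 1.0."""
    return min(timestep / final_timestep, 1.0)


def idle_hours(samples):
    """Hours for which the most recent timestep value has not changed.

    samples: list of (hours_since_first_reading, timestep), in time order.
    """
    if not samples:
        return 0.0
    last_time, last_step = samples[-1]
    first_seen = last_time
    # Walk backwards to find when the current timestep value first appeared.
    for time, step in reversed(samples):
        if step != last_step:
            break
        first_seen = time
    return last_time - first_seen


def looks_stuck(samples, min_idle_hours=24.0):
    """True if the timestep has not advanced for at least min_idle_hours (assumed threshold)."""
    return idle_hours(samples) >= min_idle_hours


if __name__ == "__main__":
    # Hypothetical readings: (hours since first look, reported timestep).
    readings = [(0, 1_000_000), (6, 1_036_800), (12, 1_036_800), (40, 1_036_800)]
    print(fraction_done(readings[-1][1]), idle_hours(readings), looks_stuck(readings))
```

A task that sits at the final timestep for far longer than a normal upload and compress phase is a candidate for a manual abort; one that stalls at an earlier timestep has more likely failed outright.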


____________
Backups: Here

old_user514136
Joined: 24 Apr 08
Posts: 6
Credit: 176,830
RAC: 0
Message 46185 - Posted: 10 May 2013, 23:45:31 UTC - in response to Message 46182.

Thought so, but thanks for the confirmation.

Nigel Garvey
Joined: 5 May 10
Posts: 51
Credit: 825,058
RAC: 0
Message 46560 - Posted: 1 Jul 2013, 21:42:11 UTC

I've recently aborted an hadcm3n which was pronounced finished after its penultimate trickle-up but which kept going for a day or so until I zapped it. There's still a 1.6 GB folder in the project folder with the name of the aborted task, which resetting the project hasn't cleared. I presume it's OK to delete this and the associated XML file manually?

NG

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 6925
Credit: 20,843,205
RAC: 1
Message 46561 - Posted: 1 Jul 2013, 22:35:34 UTC - in response to Message 46560.

That model didn't finish. But that's yet another failure mode with this model type: get very close to the end and then fail for some reason. Bad luck there.
(I guess that you're talking about the BOINC manager saying finished; i.e. at 100%. This happens when it stops getting info back from the model, even if the model hasn't actually finished.)

So yes, you have to get rid of that folder and file manually.


Nigel Garvey
Joined: 5 May 10
Posts: 51
Credit: 825,058
RAC: 0
Message 46563 - Posted: 1 Jul 2013, 23:22:22 UTC - in response to Message 46561.

Thanks, Les. Yes I did mean it was showing in BOINC Manager as completed.

Les Bayliss wrote:
Bad luck there.

While it would have been nice to have completed it properly and to have uploaded the final data, I suppose I did get a lot further with it than the two previous recipients! :)

NG

Nigel Garvey
Joined: 5 May 10
Posts: 51
Credit: 825,058
RAC: 0
Message 46580 - Posted: 3 Jul 2013, 8:59:34 UTC - in response to Message 46560.

I wrote:
I've recently aborted an hadcm3n which was pronounced finished after its penultimate trickle-up but which kept going for a day or so until I zapped it.


I notice that the information about the task has now been updated on the Web site. The trickle at timestep 1,036,800 was returned and I've received the full credit. It seems only the last batch of data wasn't uploaded. The stderr output is full of raving about environment variables being ignored.
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15793995

NG

Profile Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 979
Credit: 3,102,089
RAC: 1,353
Message 46582 - Posted: 3 Jul 2013, 12:32:16 UTC - in response to Message 46580.

... The stderr output is full of raving about environment variables being ignored.
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/result.php?resultid=15793995...
That's a Mac thing and neither CPDN- nor BOINC-specific - at least, last time I looked.

Copyright © 2019 climateprediction.net