climateprediction.net home page
exiting project after request of benchmarks

Profile gandhi

Joined: 5 Aug 04
Posts: 22
Credit: 7,271,105
RAC: 0
Message 14482 - Posted: 18 Jul 2005, 5:09:05 UTC

For a few days now, both of my 24/7 machines have had the same problem: no trickles for more than a day.

I noticed this over the weekend and thought it might be caused by the internet connection.

Just now I saw this in the message box:
---------------------------
2005-07-14 15:08:14 [---] Suspending computation and network activity - running CPU benchmarks
2005-07-14 15:08:14 [climateprediction.net] Pausing result 14ti_200073453_0 (removed from memory)
2005-07-14 15:08:16 [---] Running CPU benchmarks
2005-07-14 15:08:24 [---] Aborting CPU benchmarks, one or more active tasks are still running.
2005-07-14 15:08:24 [---] Resuming computation and network activity
2005-07-14 15:08:24 [---] request_reschedule_cpus: Resuming activities
2005-07-14 15:08:25 [---] request_reschedule_cpus: process exited
-------------------------

One machine did this on July 14th, the other as early as the 13th.

Why did this happen, and what can I do so that I don't have to check on the processes every day?

Systems:
This computer:
#46462 - WinXP, P4 2.66 GHz, 512 MB RAM, 55 GB free disk space
Other:
#179231 - Win2003 Server, P4 3.0 GHz HT, 1024 MB RAM, 4 GB free disk space
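
Rather than checking the message box by hand every day, a short script could scan a saved copy of the messages for the abort line. The following is only a sketch: it assumes the messages have been copied into a text string in the bracketed format quoted above, and the function name `find_aborted_benchmarks` is made up for this example. It reports when an abort happened, not whether the model restarted afterwards.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2005-07-14 15:08:24 [---] Aborting CPU benchmarks, one or more active tasks are still running.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(.*?)\] (.*)$")


def find_aborted_benchmarks(log_text):
    """Return the timestamps of all 'Aborting CPU benchmarks' messages."""
    aborts = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip separators and anything not in the expected format
        stamp, _project, message = match.groups()
        if message.startswith("Aborting CPU benchmarks"):
            aborts.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
    return aborts


# Sample input: the excerpt quoted in this post.
SAMPLE = """\
2005-07-14 15:08:14 [---] Suspending computation and network activity - running CPU benchmarks
2005-07-14 15:08:14 [climateprediction.net] Pausing result 14ti_200073453_0 (removed from memory)
2005-07-14 15:08:16 [---] Running CPU benchmarks
2005-07-14 15:08:24 [---] Aborting CPU benchmarks, one or more active tasks are still running.
2005-07-14 15:08:24 [---] Resuming computation and network activity
2005-07-14 15:08:24 [---] request_reschedule_cpus: Resuming activities
2005-07-14 15:08:25 [---] request_reschedule_cpus: process exited
"""

if __name__ == "__main__":
    for when in find_aborted_benchmarks(SAMPLE):
        print("Benchmark aborted at", when.isoformat(sep=" "))
```

If the script reports an abort, it is worth confirming in Task Manager that the model is really running again.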
ID: 14482
crandles
Volunteer moderator

Joined: 16 Oct 04
Posts: 692
Credit: 277,679
RAC: 0
Message 14485 - Posted: 18 Jul 2005, 8:52:41 UTC
Last modified: 18 Jul 2005, 9:39:13 UTC

See

http://boinc-doc.net/boinc-wiki/index.php?title=Aborting_CPU_benchmarks%2C_one_or_more_active_tasks_are_still_running

I am thinking we need to make this more widely known. At least get it in the FAQ. I know Thyme Lawn and Chris Sutton have reported the problem. Any news on this before I add something to the Wiki FAQ?

BTW you may want to consider changing the preference to keep in memory while suspended for performance reasons. I am not sure but this may also speed up exiting the process; if so it may avoid this problem.
_______________________________
Visit BOINC WIKI (http://boinc-doc.net/boinc-wiki/index.php?title=Main_Page) for help

And join BOINC Synergy (http://www.boincsynergy.com/) for all the news in one place.
ID: 14485
Profile Andrew Hingston
Volunteer moderator

Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 14492 - Posted: 18 Jul 2005, 10:27:32 UTC - in response to Message 14485.  
Last modified: 18 Jul 2005, 10:42:48 UTC

> BTW you may want to consider changing the preference to keep in memory while
> suspended for performance reasons. I am not sure but this may also speed up
> exiting the process; if so it may avoid this problem.

My experience is that it is unlikely to help, but regular rebooting of the computer may do so (I'm guessing, but it may be what stopped it happening for me last time). It's been suggested that it has to do with lots of open file handles, for those who understand these things.

Might be worth mentioning in the Wiki that one consequence is being unable to view the graphics (because although BOINC Manager shows the app to be running, it isn't). People who find that graphics suddenly are not working should check in task manager that the app is actually running.
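
The Task Manager check can also be scripted. This is a minimal sketch, assuming a Windows machine where the standard `tasklist` command is available; the image name `model_app.exe` in the sample is a placeholder, since the real name of the model executable is not given in this thread.

```python
import csv
import io
import subprocess


def image_names(tasklist_csv):
    """Extract the image-name column from `tasklist /FO CSV /NH` output."""
    reader = csv.reader(io.StringIO(tasklist_csv))
    return [row[0] for row in reader if row]


def is_running(tasklist_csv, prefix):
    """True if any listed process name starts with `prefix` (case-insensitive)."""
    prefix = prefix.lower()
    return any(name.lower().startswith(prefix) for name in image_names(tasklist_csv))


def query_tasklist():
    """Run tasklist (Windows only) and return its CSV output without a header row."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample output; "model_app.exe" stands in for the real model executable.
SAMPLE_TASKLIST = (
    '"boinc.exe","1234","Console","0","8,000 K"\n'
    '"model_app.exe","5678","Console","0","60,000 K"\n'
)

if __name__ == "__main__":
    print(is_running(SAMPLE_TASKLIST, "model_app"))
```

On a real machine, pass `query_tasklist()` instead of the sample string, with the prefix of whatever executable Task Manager normally shows for the model.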
ID: 14492
Profile Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 14506 - Posted: 19 Jul 2005, 15:22:11 UTC

I've pinned down where the problem is in BOINC (it sometimes fails to check that applications have closed down). I've posted debug logs to the BOINC development mailing list with 3 different scenarios:

1) a normal successful benchmark
2) an aborted benchmark which restarts the apps
3) an aborted benchmark which failed to restart the apps

I get the impression I've been pretty much talking to myself in my posts (no replies and no attempts to fix it yet), so I guess it's time for me to try to find time to come up with a fix myself.
ID: 14506
ePig

Joined: 5 Aug 04
Posts: 7
Credit: 1,116,870
RAC: 0
Message 14680 - Posted: 27 Jul 2005, 13:22:47 UTC - in response to Message 14506.  

I'm running into this problem again at the moment. I've tried removing CPDN from the memory and then back again (a few times) - it eventually works, but not reliably - is anyone else still getting this problem?

It's not a problem in itself, but the model eventually stops resuming (as it seems to need benchmarks to stay running after the ~5 day inter-benchmark period, like gandhi above) and stays off until I suspend the model and order a manual benchmark. Rather annoying considering BOINC Manager reckons it's still running the model.

epig

ID: 14680
Profile geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2169
Credit: 64,553,422
RAC: 6,017
Message 14684 - Posted: 28 Jul 2005, 13:10:35 UTC

Well, now a different Windows PC running BOINC 4.45 has done it. PC was sitting there idle for 6.5 hours while it could have been crunching. Not good...

7/27/2005 11:41:24 PM|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
7/27/2005 11:41:24 PM|climateprediction.net|Requesting 0 seconds of work, returning 0 results
7/27/2005 11:41:25 PM|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
7/28/2005 1:30:59 AM||Suspending computation and network activity - running CPU benchmarks
7/28/2005 1:30:59 AM|climateprediction.net|Pausing result 15zi_100074980_1 (removed from memory)
7/28/2005 1:31:01 AM||Running CPU benchmarks
7/28/2005 1:31:09 AM||Aborting CPU benchmarks, one or more active tasks are still running.
7/28/2005 1:31:09 AM||Resuming computation and network activity
7/28/2005 1:31:09 AM||request_reschedule_cpus: Resuming activities
7/28/2005 1:31:10 AM||request_reschedule_cpus: process exited
7/28/2005 8:09:47 AM||Suspending computation and network activity - user request
7/28/2005 8:09:52 AM||Resuming computation and network activity
7/28/2005 8:09:52 AM||request_reschedule_cpus: Resuming activities

ID: 14684
dajashby

Joined: 1 Sep 04
Posts: 55
Credit: 17,223,688
RAC: 967
Message 14907 - Posted: 3 Aug 2005, 21:58:01 UTC

I've had this problem a couple of times since I upgraded BOINC to run the sulphur model. Each time I've just exited from BOINC and restarted it, and it has reliably come good. Very annoying, though! It only happens on my Windows machine.
Derrick Ashby
ID: 14907
Profile Andrew Hingston
Volunteer moderator

Joined: 17 Aug 04
Posts: 753
Credit: 9,804,700
RAC: 0
Message 14908 - Posted: 3 Aug 2005, 23:44:40 UTC

Rebooting Windows may help. I had it happen to both my machines just after I went away for a long weekend, but it has not happened since - either it was Sod's Law, or rebooting made a difference.
ID: 14908
dajashby

Joined: 1 Sep 04
Posts: 55
Credit: 17,223,688
RAC: 967
Message 14911 - Posted: 4 Aug 2005, 4:09:30 UTC - in response to Message 14908.  

> Rebooting Windows may help. I had it happen to both my machines just after I
> went away for a long weekend, but it has not happened since - either it was
> Sod's Law, or rebooting made a difference.

The couple of times it's happened to me have been a week or so apart, and I've rebooted between them. I don't know if rebooting as such does anything, except restart the application. I've just manually run the benchmarking process, and the same thing happened - it failed, and the models didn't resume. I then suspended activity and resumed it, and that started the models going again. It might be interesting to do this repeatedly and see if the program clags up completely, but I'm not planning to do that...



Derrick Ashby
ID: 14911
ePig

Joined: 5 Aug 04
Posts: 7
Credit: 1,116,870
RAC: 0
Message 14963 - Posted: 7 Aug 2005, 12:42:01 UTC - in response to Message 14911.  
Last modified: 7 Aug 2005, 12:53:48 UTC

> The couple of times it's happened to me have been a week or so apart, and I've
> rebooted between them. Don't know if rebooting as such does anything, except
> restart the application. I've just manually run the benchmarking process, and
> the same thing happened - it failed, and the models didn't resume. I then
> suspended activity, and resumed activity, and that started the models going
> again. It might be interesting to do this repeatedly and see if the program
> clags up completely, but I'm not planning to do that...

It seems to be just fine when suspending - I've done this manually a lot of times now. It's as if the client can't unload the model from memory fast enough, so it stops benchmarking. The problem is that without the regular benchmark (the client wants to do it once every 5 days - seems a little too often, doesn't it?), the client refuses to work AT ALL. This is super annoying since I'm running my machine for days or weeks without rebooting. I haven't tried the reboot remedy yet. I'm surprised this isn't a more commonly posted problem, or is everyone running the 4.19 client?
epig

ID: 14963
Profile geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2169
Credit: 64,553,422
RAC: 6,017
Message 14965 - Posted: 7 Aug 2005, 13:07:39 UTC - in response to Message 14963.  

> Surprised this isn't a more commonly
> posted problem, or is everyone running the 4.19 client?

See http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=2921 for some hope for the next BOINC client.

I wonder how many are seeing the problem, then just stopping and restarting the client.
ID: 14965
Profile old_user15351

Joined: 8 Sep 04
Posts: 23
Credit: 121,446
RAC: 0
Message 15391 - Posted: 26 Aug 2005, 0:19:48 UTC

I've had this behavior quite a few times myself. Usually I'm there when it happens and I just restart BOINC. That's not a proper solution, but it works for now. Does anyone know what the official Berkeley BOINC devs are doing about it?

It might be worth checking over at the SETI boards to see if anything's mentioned there.
ID: 15391
Profile Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 15400 - Posted: 26 Aug 2005, 7:16:14 UTC

The BOINC development team are well aware of the problem (Chris Sutton and myself have made sure of that!)

Chris has built a version of BOINC 4.45 with an extended timeout before BOINC aborts the benchmark. It has been extensively tested and hasn't exhibited the same problem so far. Arnaud is very kindly hosting it at http://arnaudboinc.free.fr/.
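
The idea of waiting longer before giving up can be illustrated with a generic polling loop. This is not BOINC's code and not Chris's patch, just a sketch of the technique: `tasks_still_running` is a hypothetical callback supplied by the caller, and the timeout values in the usage note are illustrative (the logs in this thread show the abort arriving about 8 seconds after the benchmarks start, but the client's real settings are not stated here).

```python
import time


def wait_for_exit(tasks_still_running, timeout_s, poll_s=0.5,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll until no tasks are running or the timeout expires.

    Returns True if every task exited in time, False if the caller
    should give up (the situation that leads to an aborted benchmark).
    """
    deadline = clock() + timeout_s
    while True:
        if not tasks_still_running():
            return True   # safe to go ahead with the benchmark
        if clock() >= deadline:
            return False  # tasks did not exit in time
        sleep(poll_s)
```

With a short timeout, a model that needs 10 seconds to leave memory makes `wait_for_exit(..., timeout_s=8)` return False, while `timeout_s=60` lets the same shutdown succeed. The `clock` and `sleep` parameters exist only so the loop can be tested without real waiting.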
ID: 15400
Thunder

Joined: 1 Sep 04
Posts: 42
Credit: 6,475,117
RAC: 0
Message 15994 - Posted: 14 Sep 2005, 13:14:18 UTC - in response to Message 15400.  

> The BOINC development team are well aware of the problem (Chris Sutton and myself have made sure of that!)

I just discovered the same problem on 2 of my 4.45 machines.

I'm not sure if this helps or not, but I've been noticing it every 4-5 days on these machines, whereas for over 2 weeks all of mine that run BOINC as a service have had no problem.

Since it's only been 2 weeks or so that I've been closely monitoring the issue, this may be purely anecdotal evidence, but I'll keep an eye on it.
ID: 15994
Profile geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2169
Credit: 64,553,422
RAC: 6,017
Message 15996 - Posted: 14 Sep 2005, 14:38:22 UTC - in response to Message 15994.  
Last modified: 14 Sep 2005, 14:38:50 UTC

> I'm not sure if this helps or not, but I've been noticing it every 4-5 days on these machines, but for over 2 weeks, all of mine that run BOINC as a service have had no problem.

The benchmark runs by itself every 5 days, so that is why you see it when you do. Use 4.45b, which Thyme linked to. I've had no problems since I've been using it, as opposed to problems at every automatic benchmark when I used the "official" 4.45.
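
For anyone who wants to know when to keep an eye on a machine, the 5-day interval makes the next automatic benchmark easy to estimate from the last "Running CPU benchmarks" line in the messages. This sketch assumes the interval is exactly 5 days counted from the last benchmark attempt, which this thread does not confirm, and it assumes the pipe-separated timestamp format shown in the log excerpt earlier in the thread.

```python
from datetime import datetime, timedelta

# Assumption: benchmarks are attempted exactly every 5 days.
BENCHMARK_INTERVAL = timedelta(days=5)


def parse_stamp(stamp):
    """Parse a timestamp such as '7/28/2005 1:31:01 AM'."""
    return datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")


def next_benchmark(last_run_stamp):
    """Estimate when the next automatic benchmark is due."""
    return parse_stamp(last_run_stamp) + BENCHMARK_INTERVAL


if __name__ == "__main__":
    print(next_benchmark("7/28/2005 1:31:01 AM"))
```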
ID: 15996
Thunder

Joined: 1 Sep 04
Posts: 42
Credit: 6,475,117
RAC: 0
Message 15998 - Posted: 14 Sep 2005, 17:29:30 UTC - in response to Message 15996.  

> The benchmark runs by itself every 5 days so that is why you see it when you do. Use 4.45b which Thyme linked to. I've had no problems since I've been using it, as opposed to problems every automatic benchmark when I used 4.45 "official".

Ahhh, thanks for the info. Slowly but surely I'm getting my head around all the subtle little things this program does.

I believe I will have to get the custom-compiled version, because I came in to find that 2 of the 3 machines at my office that run BOINC as a service had also stopped. (There goes that theory out the window.)

At least, thanks to the wonders of remote management, it only took 30 seconds to get into both and restart the BOINC service. :)
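
One way to script that kind of remote restart is the Windows `sc` command, which accepts a `\\machine` argument before the action. This is a sketch only: the service name `BOINC` and the host name `office-pc1` are assumptions for illustration, so check the actual service name in the Services console first.

```python
import subprocess


def service_restart_commands(host, service="BOINC"):
    """Build the `sc` stop/start commands for a service on a remote host.

    The default service name "BOINC" is an assumption, not taken from this thread.
    """
    target = "\\\\" + host  # sc expects the machine name as \\host
    return [
        ["sc", target, "stop", service],
        ["sc", target, "start", service],
    ]


def restart_service(host, service="BOINC"):
    """Run the stop and start commands in order (Windows only)."""
    for command in service_restart_commands(host, service):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    for command in service_restart_commands("office-pc1"):
        print(" ".join(command))
```

Note that `sc stop` returns before the service has fully stopped, so a real script may need a short pause before issuing the start command.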
ID: 15998
