climateprediction.net home page
Project has no tasks available

Profile Dave Jackson
Volunteer moderator

Send message
Joined: 15 May 09
Posts: 4314
Credit: 16,380,160
RAC: 3,563
Message 45313 - Posted: 6 Dec 2012, 18:17:44 UTC

Interesting message in event log today

[error] No start tag in scheduler reply


I guess it makes a change from No tasks available. The machine in question is currently stocked up with WCG tasks for the next four or five days. Just interested in the meaning of the message.
ID: 45313 · Report as offensive     Reply Quote
Profile Thyme Lawn
Volunteer moderator

Send message
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 45319 - Posted: 6 Dec 2012, 23:02:21 UTC - in response to Message 45313.  

Just interested in the meaning of the message.

It means that the scheduler reply didn't contain a <scheduler_reply> tag. It's supposed to start with that, so the reply must have been corrupt or (more likely) empty.
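As a rough illustration of the check described above, here is a minimal Python sketch. It is not the BOINC client's actual code; the function name and the returned labels are made up for this example. It only shows the idea that a reply with no <scheduler_reply> tag is either empty or something other than a valid reply.

```python
def classify_scheduler_reply(reply_text: str) -> str:
    """Roughly classify a scheduler reply body (illustrative only).

    The function name and labels are hypothetical; only the
    <scheduler_reply> tag itself comes from the post above.
    """
    if not reply_text.strip():
        # Nothing came back at all: the "more likely" case in the post.
        return "empty"
    if "<scheduler_reply>" not in reply_text:
        # Something came back, but it is not a scheduler reply
        # (for example a corrupt body or a web server error page).
        return "no start tag"
    return "ok"


if __name__ == "__main__":
    for body in ["", "<html>error page</html>", "<scheduler_reply>\n</scheduler_reply>"]:
        print(repr(body), "->", classify_scheduler_reply(body))
```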
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 45319 · Report as offensive     Reply Quote
old_user114245

Send message
Joined: 24 Nov 05
Posts: 2
Credit: 309,254
RAC: 0
Message 45608 - Posted: 6 Mar 2013, 14:34:43 UTC - in response to Message 45260.  

Looks like the lack of WU's has ended. Now I am getting 3 or 4 at a time, and they are processing faster than they did with my old 2-CPU configuration. Also, more of them are completing without ABENDs despite the instability of Windows 8, which ABENDs 4 or 5 times a day lately, probably because of device drivers that are too old. I thought going to Win 8 was going to make my system more stable; boy, was I ever wrong!
I've gone from a bit over 122,000 credits to over 242,000 in just a couple of months, so it looks like I'm finally making progress, even though I am also processing tasks for SETI, LHC, Rosetta, World Community Grid (a half dozen different sub-projects), Cosmology, Einstein, Milkyway and Lattice Project in addition to Climate Predict. They each send dozens of WU's at a time to process (except LHC, which still doesn't send a lot of work), and still I have been able to earn as many credits in the past 2-3 months as I had in the previous SEVEN YEARS I have been processing CP projects. The processing speed of my 4-CPU configuration is really amazing despite the unreliable nature of Win 8, which has been so shaky that I have had to cease processing anything for days at a time instead of running 24/7 as previously.

ID: 45608 · Report as offensive     Reply Quote
Profile Iain Inglis
Volunteer moderator

Send message
Joined: 16 Jan 10
Posts: 1079
Credit: 6,906,534
RAC: 6,466
Message 45609 - Posted: 6 Mar 2013, 17:44:46 UTC
Last modified: 6 Mar 2013, 17:50:35 UTC

While it is true that a failed task is of some use to the project, a completed task is much more valuable - and more satisfying for volunteers as well. Despite a lot of effort by that machine, it has finished only three out of 31 tasks. There must be something seriously wrong if the success rate is only 10%.

If the machine is crashing four or five times a day then it really isn't going to do well with tasks of CPDN's size and duration. It would be a very good idea to find out what the problem is and to resolve it. Quite apart from any concerns about distributed computing, no machine should crash that often: the machine itself looks very capable.
ID: 45609 · Report as offensive     Reply Quote
Profile mo.v
Volunteer moderator
Avatar

Send message
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 45613 - Posted: 7 Mar 2013, 0:51:15 UTC

Yes, all those crashes mean there's a problem in that computer. I'd be very surprised if the instability is due solely or even mainly to Win 8 which was made for desktops and laptops just as much as for mobile devices.

I've looked through the error codes and stderr reports of nearly all the crashed models on the first two pages of your computer's results. In many cases I've also looked at the pages for the workunits to see how other computers managed with the same models.

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/results.php?hostid=1261663

Could I make a few points and suggestions?

* To see the stderr report of a model, go to a Task page then click on stderr+ to see the details.

* Don't spend time trying to look at the Task pages of Hadam regional models as they often won't open up. We can get plenty of info from looking at the Hadcm Task pages.

* The models that crashed with exit code 25 will almost certainly have ended because your computer crashed.

* A small number of the models that crashed with exit code 22 have 5 or 6 instances of INVALID THETA at the end of the report. This is almost always due to the model itself producing impossible climate conditions, so that's the fault of the models, not your computer.

* However, quite a few models with exit code 22 did not crash with impossible climate. I think the problem in these cases is probably instability of the computer.

* Two or three models crashed with code 193. Jorden explains this error in his FAQs. This seems to point to a memory or RAM problem. Your computer has lots of RAM. Test it. Windows 7 has its own memory-testing/diagnostics program and I expect Win 8 has too.

If it fails the test using all RAM modules together, rerun the test with each of the modules in turn. If you find a faulty module, download MEMTEST and use that to double-check.

Are all the RAM modules the same type? A previous computer of mine ran beautifully with type A and beautifully with type B. But A + B produced crashes.
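The triage in the bullet points above can be written down as a small lookup. This is only a sketch of the reasoning in this post, not an official CPDN or BOINC tool; the function name and the wording of the returned strings are invented for illustration, and the only inputs assumed are the exit code and the stderr text from the Task page.

```python
def triage_exit_code(exit_code: int, stderr_text: str = "") -> str:
    """Map a crashed model's exit code to the likely cause given in the post.

    Hypothetical helper: it covers only the codes discussed above
    (25, 22 and 193) and returns a short plain-English hint.
    """
    if exit_code == 25:
        return "almost certainly ended because the computer crashed"
    if exit_code == 22:
        if "INVALID THETA" in stderr_text:
            # Impossible climate conditions: the model's fault, not the PC's.
            return "model produced impossible climate conditions"
        return "probably instability of the computer"
    if exit_code == 193:
        return "possible memory (RAM) problem; run a memory test"
    return "not covered by the points above"


if __name__ == "__main__":
    print(triage_exit_code(22, "... INVALID THETA ..."))
    print(triage_exit_code(193))
```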

Please let us know what's happening.
Cpdn news
ID: 45613 · Report as offensive     Reply Quote
Profile Dave Jackson
Volunteer moderator

Send message
Joined: 15 May 09
Posts: 4314
Credit: 16,380,160
RAC: 3,563
Message 45615 - Posted: 7 Mar 2013, 9:05:12 UTC - in response to Message 45613.  

Just to echo Mo's point, I have had two computers in the past that didn't like a mix of memory types that were each fine on their own, while in yet another computer the same modules were completely happy together. I never did work out what was different about the computer that let them play together. Playing with memory bus speed didn't seem to make any difference; in fact, the one that let them play was a Duron processor that still let them play with a substantial overclock.
ID: 45615 · Report as offensive     Reply Quote
old_user671679

Send message
Joined: 30 Jan 12
Posts: 38
Credit: 10,197,388
RAC: 0
Message 45653 - Posted: 12 Mar 2013, 21:27:44 UTC
Last modified: 12 Mar 2013, 21:56:14 UTC

Well, 3600 new jobs and counting, European Region models, I hope they all work okay (I'm sure they will).

Edit: Over 5000 now, life is good ATM.
ID: 45653 · Report as offensive     Reply Quote
old_user437754

Send message
Joined: 20 Mar 07
Posts: 1
Credit: 432,097
RAC: 0
Message 46125 - Posted: 30 Apr 2013, 2:27:03 UTC - in response to Message 45653.  

April 29th and no work again?
ID: 46125 · Report as offensive     Reply Quote
Les Bayliss
Volunteer moderator

Send message
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 46126 - Posted: 30 Apr 2013, 3:21:57 UTC - in response to Message 46125.  

As has been said often, the work on this project isn't continuous.
It's in batches of a few thousand, and then there's a wait until enough results are returned for the climate physicists to decide what they want to do next.
And with about 30 thousand computers attached, not everyone will get work when it IS available.

Also, as mentioned in the News posts, and also in several discussion threads, there are a few problems at present.



Backups: Here
ID: 46126 · Report as offensive     Reply Quote
old_user2033

Send message
Joined: 27 Aug 04
Posts: 14
Credit: 763,720
RAC: 0
Message 47792 - Posted: 13 Dec 2013, 20:28:33 UTC - in response to Message 45294.  

I had four tasks running on my four cores, until they all crashed... Bye bye CPDN.

Task ID  Workunit ID
15462521 8409130
15454013 8402208
15453805 8406283
15453417 8406273

On the two iMacs I had, every single workunit crashed upon reboot. And now after waiting for so long I had new work for my Win7 laptop and it crashed too.
There are other, less frustrating projects available on BOINC.


I came back last week to try it again, this time only with my Win7 laptop. The 4 WUs got to about 50% and the next thing I saw was that they had all crashed. What a waste of cpu-time and energy again. Again, bye bye. Don't think I will ever return if this is the level of quality you can offer.
ID: 47792 · Report as offensive     Reply Quote
Profile MikeMarsUK
Volunteer moderator
Avatar

Send message
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 47801 - Posted: 17 Dec 2013, 19:00:57 UTC - in response to Message 47792.  
Last modified: 17 Dec 2013, 19:02:29 UTC

... I came back last week to try it again, this time only with my Win7 laptop. The 4 WUs got to about 50% and the next thing I saw was that they had all crashed. What a waste of cpu-time and energy again. Again, bye bye. Don't think I will ever return if this is the level of quality you can offer.


The tasks haven't reported back yet, so we can't see why they crashed. If they all crashed simultaneously, that usually implies something environmental (such as a power cut or Windows shutting down before Boinc has exited). If it was a laptop, perhaps it tried to hibernate the tasks.

Because they run for so long, CPDN tasks do require some TLC.
I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 47801 · Report as offensive     Reply Quote
Profile JIM

Send message
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 47805 - Posted: 18 Dec 2013, 1:44:30 UTC

I have found that the tasks usually survive an accidental trip through hibernation. Last week there was a power cut during the middle of the night. Both of my laptops went into hibernation when their batteries were exhausted. (Batteries only last about 2 hours when the machine is being flogged as hard as Boinc does.) Automatic shutdown is set for 20% charge. When the machines were restarted all the models had survived.

I know that using hibernation is not a good idea with Boinc, but it is a lot better than just having the computer run until the battery is dead and then crash. That's just about a guaranteed model killer.

ID: 47805 · Report as offensive     Reply Quote
Profile JIM

Send message
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 47905 - Posted: 1 Jan 2014, 4:54:41 UTC
Last modified: 1 Jan 2014, 4:56:34 UTC

Getting new work.

There have been a lot of posts here about the fact that the project often has no work available. If you look at the "Server Status" page you will see that it reads "0" except for a few Hadam3's that are almost certainly "zombies" that will fail in the download phase.

Despite the fact that technically the project has no work, I have picked up 3 hadcm3n WU's in the last few days. These WU's are all reissues of WU's that failed on other machines. They all end in _3 or _4. They have been around the block a couple of times already.

In order to get these it is necessary that Boinc be running (duh). If you don't have any work from CPDN, run something. Run 24/7 (with an internet connection) if you can. The reissues are generated in very small numbers as tasks time out, and they are snapped up just as fast. The more you run, the better the chance. Running 24/7, the chances of catching one are greatly increased.
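For anyone who wants to spot reissues in their own task list, here is a small sketch of reading the trailing _N suffix mentioned above. The task name used is made up, and the assumption that a suffix of _0 marks the first issue (so anything higher is a reissue) is my reading of how the names are numbered, not something stated in this thread.

```python
import re
from typing import Optional


def issue_number(task_name: str) -> Optional[int]:
    """Return the trailing _N number of a task name, or None if absent."""
    match = re.search(r"_(\d+)$", task_name)
    return int(match.group(1)) if match else None


def is_reissue(task_name: str) -> bool:
    """Assumption: _0 is the first issue, so any higher suffix is a reissue."""
    number = issue_number(task_name)
    return number is not None and number > 0


if __name__ == "__main__":
    # "hadcm3n_example_3" is a hypothetical name, used only for illustration.
    for name in ["hadcm3n_example_0", "hadcm3n_example_3", "no_suffix_here"]:
        print(name, issue_number(name), is_reissue(name))
```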
ID: 47905 · Report as offensive     Reply Quote
Profile mo.v
Volunteer moderator
Avatar

Send message
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 47907 - Posted: 1 Jan 2014, 10:22:42 UTC
Last modified: 1 Jan 2014, 10:27:58 UTC

Thanks for the suggestions, Jim. Some extra suggestions:

* increase the work buffer to 10 days (though you have to make sure you don't get too much work from too many projects).

* if you see that tasks are available and you're really keen to grab some you can temporarily suspend tasks from OTHER projects

* if for any reason you suspend work from a project, BOINC prevents that project from fetching new tasks

* if you can't get hold of new CPDN tasks, do consider joining other projects as well. Find them in the Tools menu of BOINC Manager. The projects listed there are all considered safe and reputable by the people in charge of BOINC at the Uni of California at Berkeley

* check in the climateprediction.net preferences of your account that you've enabled all the model types you want. At the moment the model types are Hadcm3m, which is longish, and all regions of Hadam3p (Europe, Pacific North West, South Africa and we hope some Australia & NZ). If you want anything available just enable them all

* if you're running BOINC tasks on a laptop make sure you're not letting it overheat by simultaneously running too many tasks for the machine's fans to cope with. Check temps by downloading (for example) Core Temp and also, if you're running GPU tasks for another project, GPU Temp. If you don't like the look of the temperatures, members here will advise you about what to do
Cpdn news
ID: 47907 · Report as offensive     Reply Quote
Profile mo.v
Volunteer moderator
Avatar

Send message
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 47935 - Posted: 6 Jan 2014, 6:46:58 UTC

It's also important to remember that CPDN has reduced the number of times a computer can request extra work from the server to once per hour. We cannot change this setting which was decided to limit the load on the server. The countdown to the next hourly attempt is shown in the Projects tab of BOINC Manager. Do not try to ask for work now by clicking the Update button as this will reset the time to 60 minutes. Patience rules!

And of course to see what models are available go to the Server Status link in the blue menu to the left.
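To show why clicking Update does not help, here is a toy model of the countdown as described in this post. It is not BOINC's real scheduling logic; the class and method names are invented, and the only facts taken from the post are the 60-minute interval and that a manual update resets the countdown.

```python
from dataclasses import dataclass

REQUEST_INTERVAL_MIN = 60  # once per hour, as stated in the post


@dataclass
class WorkRequestTimer:
    """Toy countdown to the next allowed work request (hypothetical model)."""

    minutes_left: int = REQUEST_INTERVAL_MIN

    def wait(self, minutes: int) -> None:
        # Time passing brings the next automatic request closer.
        self.minutes_left = max(0, self.minutes_left - minutes)

    def click_update(self) -> None:
        # Per the post: a manual update resets the countdown to 60 minutes.
        self.minutes_left = REQUEST_INTERVAL_MIN

    def can_request(self) -> bool:
        return self.minutes_left == 0


if __name__ == "__main__":
    timer = WorkRequestTimer()
    timer.wait(45)
    print("after waiting 45 min:", timer.minutes_left)
    timer.click_update()
    print("after clicking Update:", timer.minutes_left)
```

In this model, an impatient click 45 minutes into the wait throws away the progress and the full hour starts again, which is the point of "Patience rules!".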
Cpdn news
ID: 47935 · Report as offensive     Reply Quote
Profile clif9710

Send message
Joined: 18 Feb 11
Posts: 44
Credit: 9,975,761
RAC: 0
Message 47943 - Posted: 8 Jan 2014, 0:58:44 UTC - in response to Message 47935.  

It's also important to remember that CPDN has reduced the number of times a computer can request extra work from the server to once per hour. We cannot change this setting which was decided to limit the load on the server. The countdown to the next hourly attempt is shown in the Projects tab of BOINC Manager. Do not try to ask for work now by clicking the Update button as this will reset the time to 60 minutes. Patience rules!

And of course to see what models are available go to the Server Status link in the blue menu to the left.


I always wondered why a request automatically resulted in a one-hour delay. Now I'm enlightened! : )
ID: 47943 · Report as offensive     Reply Quote
Les Bayliss
Volunteer moderator

Send message
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 47945 - Posted: 8 Jan 2014, 1:59:20 UTC - in response to Message 47943.  

Too many computers, not enough models, too many people trying to grab large numbers of them.
This way the work is shared a bit better.

ID: 47945 · Report as offensive     Reply Quote
3rkko

Send message
Joined: 12 Feb 08
Posts: 66
Credit: 4,877,652
RAC: 0
Message 48011 - Posted: 19 Jan 2014, 21:49:31 UTC

If a climate scientist working on computational models does not know how to use free computing capacity, I would fire him. But as it is, we'll just wait for work...
ID: 48011 · Report as offensive     Reply Quote
Profile JIM

Send message
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 48012 - Posted: 19 Jan 2014, 22:49:57 UTC

The way I understand it, it is not free to the Scientists. They have to pay the people at CPDN to generate the models for them and manage the data collection.
Only the running of the WU's by us is free.
ID: 48012 · Report as offensive     Reply Quote
Art Masson
Avatar

Send message
Joined: 16 Oct 11
Posts: 254
Credit: 15,954,577
RAC: 18
Message 48057 - Posted: 26 Jan 2014, 15:26:59 UTC - in response to Message 47935.  

Hi Mo,

I'm running BOINC Manager 7.2.38. I do not see the countdown information you reference in the Projects tab. The fields are: Project, Account, Team, Work Done, Average Work Done, Resource Share, and Status (which is blank). Where is the countdown time displayed?

Art Masson
ID: 48057 · Report as offensive     Reply Quote

©2024 climateprediction.net