climateprediction.net home page
Longer Deadlines - That's What I'd Like To See

ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47046 - Posted: 15 Sep 2013, 10:09:55 UTC
Last modified: 15 Sep 2013, 10:19:27 UTC

I've noticed lately that work units' deadlines are earlier than they used to be and somewhat unrealistic. We aren't all running supercomputers 24/7, and we aren't all running CPDN only.

My current work unit is now running at high priority, which means something else is waiting as a result. At last check it had run for 213.13.30 with 147.34.56 to go, against a deadline of 25/09/2013 which, by my calculations, it will not meet.
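
As a rough check on that estimate, here is a minimal sketch of the arithmetic. It assumes the quoted figures are hours.minutes.seconds, that the "to go" estimate is accurate, that the deadline falls at 00:00 UTC on 25 September (the post gives only the date), and that the task gets one core whenever the machine is on. All names are illustrative.

```python
from datetime import datetime


def hms_to_hours(hms: str) -> float:
    """Convert an 'H.MM.SS' string, as quoted in the post, to hours."""
    hours, minutes, seconds = (int(part) for part in hms.split("."))
    return hours + minutes / 60 + seconds / 3600


# Figures from the post; the deadline time of day is an assumption.
remaining_hours = hms_to_hours("147.34.56")
posted = datetime(2013, 9, 15, 10, 10)
deadline = datetime(2013, 9, 25, 0, 0)

# Wall-clock hours available between the post and the deadline.
wall_hours = (deadline - posted).total_seconds() / 3600

# Hours per day the task must actually run to finish in time.
required_per_day = 24 * remaining_hours / wall_hours

print(f"remaining work: {remaining_hours:.1f} h")
print(f"time until deadline: {wall_hours:.1f} h")
print(f"needed: {required_per_day:.1f} h of crunching per day")
```

Under these assumptions the task needs a little over 15 hours of crunching every day, which a machine that is shut down overnight and shared with other projects may not manage.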

I recall much longer deadlines being given in the past, which meant the work could progress at normal priority and still get done, even allowing for other projects and for the fact that I like to conserve electricity (and money) by shutting down at night, not to mention giving my machine a well-earned rest.

Why the sudden urgency?
Peter
Toronto, Canada
ID: 47046
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 47049 - Posted: 15 Sep 2013, 17:43:46 UTC - in response to Message 47046.  

There was a time when one participant ran the full 160-year model on a single machine. Thanks to years of whining/whinging on the boards, the models were broken into four tasks each, with consequent larger down-/up-loads to pass start-up dumps between machines. (We all know what the added load did to the servers' lifespans.)

Deadlines were shortened to allow reasonable completion dates for the four-task set -- can't start the second 40 years until the first 40 are in hand, etc. It's a balancing act, science requirements vs. processing realities.

Not so sudden, really. It's been like this for quite a while.
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 47049
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47050 - Posted: 15 Sep 2013, 18:27:37 UTC - in response to Message 47049.  
Last modified: 15 Sep 2013, 19:06:00 UTC

Well I didn't mean to come across as whinging and whining - just opining and asking. ;-)

Thanks for the explanation. Not sure I totally understand it but I'll take your word for it.

I guess I'm thinking back quite a while. I'm not sure I'm happy with any project that decides to pre-empt other projects by going high-priority right from the start, but it seems to be becoming more and more prevalent, so I guess we have to put up with it.

It could mean people cutting back on numbers of projects and/or how much cache they decide to keep.
Peter
Toronto, Canada
ID: 47050
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 47051 - Posted: 15 Sep 2013, 20:17:19 UTC


Keep in mind that:
a) the deadlines aren't enforced - the models will still be accepted regardless.
b) in the long run, CPDN will still end up with your preferred resource-share even if it has a high-priority task. It'll grab more than its fair share in the short term, but then other projects will get the priority for a long while until it has evened out again.



I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 47051
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47052 - Posted: 15 Sep 2013, 20:23:59 UTC - in response to Message 47051.  
Last modified: 15 Sep 2013, 20:24:14 UTC

OK, thanks.
Peter
Toronto, Canada
ID: 47052
Greg van Paassen
Joined: 17 Nov 07
Posts: 142
Credit: 4,271,370
RAC: 0
Message 47057 - Posted: 16 Sep 2013, 9:38:06 UTC - in response to Message 47049.  

science requirements vs. processing realities. [...]Not so sudden, really. It's been like this for quite a while.

Processing realities, indeed. And limitations of BOINC when used for long duration tasks.

The problem that CPDN has is with task re-issue. The BOINC system won't re-issue a task until either the client reports failure or the deadline date has passed. The second case is unfortunately quite common with CPDN's work, even with generous deadlines: a volunteer computer starts on a task, and then ... gets redeployed to do something else. As far as CPDN knows, the task is still "in progress".

If CPDN has set the deadline years into the future, it has to wait years just to learn that it needs to re-issue the task ... and then wait some more. But even scientists have to work to a schedule. ;-)

Until reasonably recently, the way that CPDN compensated for this time-out problem was to issue tasks to 5 or 7 computers at a time. The odds were that one of them would process the task to completion.

But of course the odds were also that more than one, or more than two, of them would process the task to completion, too. There was duplication of effort by crunchers, and extra data transmission, processing and storage costs for the project.

The new method, of short deadlines and issuing work to only one computer (reissuing if timed out or failed), fixes these problems (and may have been intended to). But now we have this "high priority" problem...
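
The trade-off behind the old 5-or-7-copies approach can be put in numbers with a small sketch. It assumes every computer finishes a task independently with the same probability; the value 0.4 is invented purely for illustration and is not a CPDN statistic.

```python
def p_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent copies completes."""
    return 1.0 - (1.0 - p) ** n


def expected_duplicates(p: float, n: int) -> float:
    """Expected number of completed copies beyond the first useful one.

    With X completions, the wasted copies are max(X - 1, 0), whose
    expectation is E[X] - P(X >= 1) = n*p - p_at_least_one(p, n).
    """
    return n * p - p_at_least_one(p, n)


P_FINISH = 0.4  # hypothetical per-computer completion probability

for copies in (1, 5, 7):
    print(
        f"{copies} copies: "
        f"P(result) = {p_at_least_one(P_FINISH, copies):.3f}, "
        f"expected duplicates = {expected_duplicates(P_FINISH, copies):.2f}"
    )
```

Issuing more copies pushes the chance of getting a result towards certainty, but the expected number of duplicated runs grows with it. Issuing a single copy wastes nothing, but then the project depends on the deadline to find out when a re-issue is needed.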

It's all an illustration of Eric Sevareid's line, "the chief cause of problems is solutions".
ID: 47057
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47059 - Posted: 16 Sep 2013, 11:12:44 UTC - in response to Message 47057.  
Last modified: 16 Sep 2013, 11:59:37 UTC

"the chief cause of problems is solutions"
- yes indeed. At least CPDN is not carrying it to extremes, as did another project I have since left, where several WUs would regularly arrive with just hours left before the deadline and naturally all would run at high priority. I complained, explaining in detail why, and was told "tough", so I left them to their own devices.
I guess it's a matter of striking a happy balance between the deadline and the level of urgency.

However, contrary to an earlier statement, things don't seem to level off, because CPDN's WU continues to run 100% of the time. It really is depriving something else of "having a go" when the hourly changeover occurs, and it seems it will continue to do so until the 25 September deadline and beyond. It's only 1 WU, so I guess it's not too serious a problem, as I can run 8 at once.
Peter
Toronto, Canada
ID: 47059
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 47060 - Posted: 16 Sep 2013, 12:47:00 UTC - in response to Message 47059.  

... However, contrary to an earlier statement, ...


Which statement are you referring to? If it's mine, 'short term' = duration of the WU, 'long term' is the year or so afterwards that it will take the processing-debt to sort itself out.


I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 47060
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47061 - Posted: 16 Sep 2013, 12:57:44 UTC - in response to Message 47060.  
Last modified: 16 Sep 2013, 12:58:46 UTC

Sorry, I was a bit vague. I was referring to
It'll grab more than its fair share in the short term, but then other projects will get the priority for a long while until it has evened out again.


In my experience that doesn't happen. That could be a problem with BOINC itself.
Peter
Toronto, Canada
ID: 47061
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 47062 - Posted: 16 Sep 2013, 13:41:02 UTC
Last modified: 16 Sep 2013, 13:41:56 UTC

What I meant was this:

It'll grab more than its fair share until the workunit has finished, but then other projects will get the priority for a long while until it has evened out again.


Boinc has a complicated 'debt' system which means that CPDN will 'owe' the other projects a lot of CPU time. Until that debt has been paid back, Boinc should prevent CPDN from downloading new units.

But it has been years since I looked at this 'debt' functionality last. It may have changed.
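
To illustrate the idea only, here is a toy model of that kind of debt accounting. It is not BOINC's actual algorithm: it assumes one CPU, one-hour time slices, and a scheduler that simply runs whichever project is owed the most time once the forced high-priority period ends. The function and project names are made up.

```python
def simulate(shares, forced_project, forced_hours, total_hours):
    """Toy debt scheduler: returns CPU hours used per project."""
    total_share = sum(shares.values())
    debt = {name: 0.0 for name in shares}  # CPU time owed to each project
    used = {name: 0 for name in shares}

    for hour in range(total_hours):
        if hour < forced_hours:
            running = forced_project          # high priority overrides debt
        else:
            running = max(debt, key=debt.get)  # most-owed project runs

        # Every project earns its resource share of this hour...
        for name, share in shares.items():
            debt[name] += share / total_share
        # ...and the one that actually ran pays for the full hour.
        debt[running] -= 1.0
        used[running] += 1

    return used


shares = {"CPDN": 1, "Other": 1}  # equal resource shares (assumed)
for hours in (100, 150, 200, 400):
    print(hours, simulate(shares, "CPDN", forced_hours=100, total_hours=hours))
```

In this toy run CPDN takes the first 100 hours outright, the other project then gets the next 100 hours to itself, and only after that do the two alternate again, so the totals match the 50/50 resource share in the long run.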
I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 47062
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47063 - Posted: 16 Sep 2013, 13:51:10 UTC - in response to Message 47062.  

It's a weird system but must make sense to someone I suppose. Of course hanging on until that work is finished means others may miss a deadline as a result, but I'm learning (slowly) to disregard it.

Some project managers unfortunately exploit the situation as I mentioned earlier.
Peter
Toronto, Canada
ID: 47063
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 47064 - Posted: 16 Sep 2013, 14:09:19 UTC


A lot of it depends on how many Boinc projects you are running simultaneously. The more there are on a single machine, the more likely that work units are going to go into high priority mode.


I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 47064
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47065 - Posted: 16 Sep 2013, 14:34:08 UTC - in response to Message 47064.  

Agreed and I have a feeling I'm guilty of that. I signed on to a lot of projects simply to guarantee work, but maybe I went overboard.
Peter
Toronto, Canada
ID: 47065
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47068 - Posted: 17 Sep 2013, 0:38:40 UTC - in response to Message 47065.  

Well, I've been proven wrong and I apologise. CPDN just relinquished its grip and now some other work, due tomorrow, has taken over.
Peter
Toronto, Canada
ID: 47068
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 47073 - Posted: 17 Sep 2013, 5:01:31 UTC

One of the things that I have noticed about the tendency of Boinc to go into "high priority" mode is the influence of the work buffer size. I mainly run CPDN. When I am trying to get new work from CP I keep the work buffer at 10 days. When I run another project (usually when CPDN has no work) I reset the buffer to 1 day and go to WCG.

With the buffer set to 1 day, CPDN and WCG projects such as SN2S will share the available cores. If, however, I reset the work buffer to 10 days (with WCG set to "no new tasks"), WCG will immediately go into "high priority" and not allow CPDN to run. There has been no change in the number of tasks the machine has to finish before the WCG deadlines, but it suddenly gets paranoid about finishing in time.
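
One possible reading of this behaviour, offered purely as an assumption and not as a description of BOINC's real scheduler, is that the client wants each task finished with a safety margin tied to the buffer setting. The rule and the numbers below are hypothetical.

```python
def is_deadline_endangered(work_left_days: float,
                           days_until_deadline: float,
                           buffer_days: float) -> bool:
    """Hypothetical rule: a task is 'endangered' if it cannot be finished
    at least buffer_days before its deadline."""
    return work_left_days > days_until_deadline - buffer_days


# Same task, same deadline; only the buffer setting changes.
work_left = 3.0       # days of crunching still needed (made-up figure)
until_deadline = 7.0  # days until the deadline (made-up figure)

for buffer_days in (1.0, 10.0):
    flag = is_deadline_endangered(work_left, until_deadline, buffer_days)
    print(f"buffer {buffer_days:>4} days -> high priority: {flag}")
```

With a 1 day buffer the task has plenty of slack, while with a 10 day buffer the same task is flagged at once, which would match the sudden switch described above if the client does work this way.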

ID: 47073
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47079 - Posted: 17 Sep 2013, 10:27:23 UTC - in response to Message 47073.  

BOINC is a work in progress obviously.
ID: 47079
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 47080 - Posted: 17 Sep 2013, 11:29:11 UTC - in response to Message 47079.  
Last modified: 17 Sep 2013, 12:52:30 UTC


BOINC is a work in progress obviously.


There are a lot of things about Boinc that I am personally unhappy with. It was designed for running short, disposable jobs which can be validated by bytewise comparison of result files, and does not cope well with jobs which can last for weeks or even months of CPU time.

However, it's simple, widespread, and easy to set up, and that's why CPDN uses it. The bottom line is that CPDN benefits overall from using Boinc, despite the various issues.
I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 47080
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47866 - Posted: 25 Dec 2013, 12:38:04 UTC - in response to Message 47080.  
Last modified: 25 Dec 2013, 12:45:27 UTC

Yet another WU is about to miss its deadline: it is due a day from now and is only 65% there, so it has weeks to go at the rate it is going, even at high priority (at which it has been running for quite some time now, so BOINC agrees with me, or at least it can't be blamed). This never used to happen, and I've always run multiple projects.
Are you absolutely sure that longer deadlines aren't in order or possible? Am I the only one this is happening to?
I seem to recall deadlines of almost a year at one time.
ID: 47866
JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 47867 - Posted: 25 Dec 2013, 14:34:01 UTC - in response to Message 47866.  

The old 160 year hadcm WUs had a deadline of about 14 months. Even the relatively short hadam3p WUs have a deadline of about 9 months. The hadcm3n (aka RAPIT) WUs, on the other hand, only get about a 3 month deadline. This is because of the segmented nature of the models: the results of each segment are used to generate the next, and the researchers don't want to wait forever for their results.

The only solution that I can think of (short of buying a faster computer) is either to run more hours each day or to restrict your machine to the hadam3p's.

Computers don't really need to rest, and every time you shut down (even doing it the "right" way) there is a small chance of the hadcm3n's crashing. I have found that my computer running 24/7 uses about $6 USD per month. Shutting down 6 hours/day would only save me about $2/month.
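
The saving can be sanity-checked with a simple proportion. This sketch assumes the machine draws the same power whenever it is on and nothing when it is off; only the $6/month and 6 hours/day figures come from the post.

```python
def monthly_saving(cost_24_7: float, hours_off_per_day: float) -> float:
    """Saving from switching off, assuming cost scales with hours powered on."""
    return cost_24_7 * hours_off_per_day / 24


saving = monthly_saving(cost_24_7=6.0, hours_off_per_day=6)
print(f"estimated saving: ${saving:.2f} per month")
```

Six hours is a quarter of the day, so the estimate comes to about $1.50 a month, in the same range as the rough $2 quoted.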

ID: 47867
ex_brit
Joined: 26 Aug 04
Posts: 84
Credit: 351,331
RAC: 0
Message 47868 - Posted: 25 Dec 2013, 14:38:10 UTC - in response to Message 47867.  
Last modified: 25 Dec 2013, 14:41:42 UTC

Thanks. I do run it most of the time anyway, and it is a fast machine. However, I do prevent BOINC tasks from using my graphics card, which may or may not apply here. I'll look into your suggestions.
ID: 47868

©2024 climateprediction.net