Posts by marmot

21) Message boards : Number crunching : Total Credit (Message 53791)
Posted 24 Mar 2016 by marmot
Post:

2. The project itself only asks volunteers to run models in the background, so the additional energy expenditure is the "marginal" increase in electricity consumed and not the entire energy consumption of a PC that the user chooses to leave on 24/7.


This is actually quite wrong.

I have spent many years watching the temperatures and energy used by my machines as they heat my house. A CPU sitting idle will use a couple of watts and hover just above room temperature. A machine used for viewing TV shows or browsing runs at about 15 to 25% CPU usage and draws roughly 10 watts (on a 35W profile) to 40 watts (on a 135W TDP). An intensive game will maybe drive the cores up to an average of 35 to 70% usage and into the 25W (35W profile) to 75W (135W profile) range.

4 to 8 Climate WUs run the CPUs up to 100%, drive the power profile up to maximum output (35W or 135W) and push the CPU temperature up to 70 to 80C.

GPU crunching is similar, or even more extreme, as a machine left on 24/7 doing nothing will have its GPU at a black screen and doing nearly nothing. Watching videos or playing games doesn't drive a GPU the way a 100% usage WU does.

These WU are responsible for 70 to 95% of the power usage on any machine that they are run upon. It's not MARGINAL at all.
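
A rough way to check that range against the CPU wattages quoted above: treat whatever the CPU would draw without BOINC as the baseline and attribute the rest of the full-load draw to the work units. This is only a sketch in Python. The function name is made up for illustration, the 2 W idle value is an assumed stand-in for "a couple watts", and it counts CPU package power only, not the rest of the machine.

```python
def wu_share_of_cpu_power(baseline_watts, full_load_watts):
    """Fraction of full-load CPU power attributable to the work units.

    baseline_watts: what the CPU would draw anyway (idle or light use).
    full_load_watts: what it draws with work units running at 100%.
    """
    if full_load_watts <= 0 or not 0 <= baseline_watts <= full_load_watts:
        raise ValueError("expected 0 <= baseline <= full load, full load > 0")
    return (full_load_watts - baseline_watts) / full_load_watts


# (baseline W, full-load W) pairs taken from the figures in the post;
# the 2 W idle value is an assumption for "a couple watts".
scenarios = {
    "idle baseline, 35W profile": (2, 35),
    "idle baseline, 135W TDP": (2, 135),
    "browsing baseline, 35W profile": (10, 35),
    "browsing baseline, 135W TDP": (40, 135),
}

for label, (baseline, full_load) in scenarios.items():
    share = wu_share_of_cpu_power(baseline, full_load)
    print(f"{label}: {share:.0%} of CPU power is due to the WUs")
```

With these inputs the share comes out at about 70% against a browsing baseline and above 90% against an idle one, which is in line with the 70 to 95% claim as far as the CPU alone is concerned.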

In 2013, global computing power consumed 10% of the world's electric power generation.

Cloud computing centers are supposedly (company advert is suspect, need a better source) using more power than the entire country of India in 2016.
22) Message boards : Number crunching : Total Credit (Message 53790)
Posted 24 Mar 2016 by marmot
Post:
If you were to look at the News and Announcements thread, I made a post about this a few hours ago.
We've been telling people for years to subscribe to that thread.




Some of us have 30 to 60 projects we are involved in, plus lives outside of BOINC and we aren't going to be checking the News threads.

If you want to get the information to us then the ONLY reliable method is a notice to the BOINC client.


FYI, instead of restoring credit the repair scripts took away even more a few minutes ago.
23) Questions and Answers : Windows : Visual Fortran Run-Time Error (Message 52684)
Posted 6 Oct 2015 by marmot
Post:
@Les Bayliss:

And climateprediction.net does NOT write the code.
It all comes from the UK Met Office, where it normally runs on their super-computers, for daily weather modelling to long term climate modelling.
All of which has been posted about many times over the years.



I'm not sure why you took this tack. You can see that I have 11 posts on these forums and obviously am not deeply involved with these projects, so going after my ignorance of years' worth of posts was unusual.

@Les Bayliss:
Climate models don't like being interrupted.
Some model types are more prone to various failures than others.



This kind of fault intolerance after 10 years of climateprediction.net running on BOINC shows some failure in the project. Probably from lack of funding leading to programmers not being able to spend appropriate amounts of time hardening their code for the BOINC environment across a heterogeneous selection of user machines. I have trouble believing that FORTRAN itself hasn't been hardened to run in a multi-core modern OS.

@ryan:
The issue is further compounded because the processes are not properly cleaned up. They stick around taking up memory until the user ends them manually, logs out, or reboots the machine.

I can consistently repeat this problem by suspending tasks then taking up a bunch of extra memory (browsers, office programs, etc) then closing them and resuming the tasks. I get a slew of fortran errors but the tasks stay in Windows process viewer.

Even if models do not like being interrupted, I don't think it should be too hard to take the few extra milliseconds or seconds to reach a safe stopping point.


Some younger coders need to take some time looking over the apps being sent out to BOINC machines and improve the fault tolerance of the code.
Maybe some student loan forgiveness could be offered.

Maybe these comments need to be taken to a UK Met Office forum or representative since they write the code, might never read any of these forums, and ClimatePrediction.net has no power to make any changes to correct these errors.

24) Questions and Answers : Windows : Visual Fortran Run-Time Error (Message 51857)
Posted 20 Apr 2015 by marmot
Post:
All that is a bit irrelevant, as the Met Office only has apps for desktops/laptops using the x86 instruction set.
There may never be any ARM/RISC version, as professionals want the results of their daily work fast, not in a few weeks/months, as provided by a lot of BOINC users.



Your comment makes little sense, as the deadline for WUs on ClimatePrediction is 1 YEAR, which is the longest deadline of any project I've ever seen. If work is required from the BOINC network more quickly, then smaller slices of work need to be put out and the deadline severely decreased.

If ClimatePrediction wants to ignore the quickly growing ARM market then they are making a huge mistake, as there will be a growing number of people going without desktops or laptops and using only ARM-based phones and pads in the next decade.

It's already happening among the college and under crowd. Who needs a laptop when you have a Samsung Note with a writing stylus, which a student can get discounted? If you want people to run BOINC WUs for you for years to come, then catch them young and get them involved.

Politics and name recognition are also considerations as climate modeling is crucial to the future of our species.
25) Questions and Answers : Windows : Visual Fortran Run-Time Error (Message 51841)
Posted 16 Apr 2015 by marmot
Post:
Forget the Work unit ID. Look in the first column, the Task ID.
This is where all the important information about each model is stored.
Go down to Stderr, and click on the + symbol to expand the list.


Thanks, found it.



Smart phones aren't powerful enough to run these models, and the UK Met Office doesn't have programs to run on them. Or on GPUs.


I wanted to test that assertion and went to the database of CPUs to look up the GFLOPS of my single core, circa 2005 Intel(R) Celeron(R) M processor 900MHz {Family 6 Model 13 Stepping 8} that completed Task 17549228 in 710 hours.
Its GFLOPS is 0.54 on its single core.

I looked up the cheap, 2014 Lumia 625 phone based on the Snapdragon 400 (8926) and found it has 0.09 GFLOPS per core and 0.26 on 4 cores.
That's not enough performance to get a Climate WU done within 1500 hours.

The 2014 iPhone 6 is a different story.
It has the A8 dual core CPU with 0.77 GFLOPS per core, which is similar performance to an Intel(R) Pentium(R) 4 CPU 1.60GHz.
That implies a return time of about 500 hours on a WU similar to the one the Celeron M 900MHz completed.

The other popular CPU's in higher end smartphones of 2014 are
the Tegra K1 at 0.67 GFLOPS per core (LINPACK seems to only recognize 2 cores on the multi-thread bench),
the Snapdragon 805 at 0.32 GFLOPS per core and
the Exynos 5420 Octa core CPU with 0.39 GFLOPS per core (again, the LINPACK benchmark seems to only be running on 2 cores and not on at least the 4 A15 cores of the A15/A7 BIG.little architecture).
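
The run-time estimates above can be reproduced with a simple scaling rule. This is a sketch in Python under a loud assumption: that run time scales inversely with the per-core LINPACK GFLOPS figure and that a WU uses one core. It ignores memory, storage, thermal throttling and whether the model could be built for ARM at all. The function and constant names are made up for illustration.

```python
# Reference point from the post: the Celeron M 900MHz (0.54 GFLOPS, one core)
# completed Task 17549228 in 710 hours.
REFERENCE_GFLOPS = 0.54
REFERENCE_HOURS = 710


def estimated_hours(gflops_per_core):
    """Estimate WU run time, assuming time is inversely proportional to GFLOPS."""
    if gflops_per_core <= 0:
        raise ValueError("GFLOPS per core must be positive")
    return REFERENCE_HOURS * REFERENCE_GFLOPS / gflops_per_core


# Per-core figures as quoted in the post.
per_core_gflops = {
    "Snapdragon 400 (Lumia 625)": 0.09,
    "Apple A8 (iPhone 6)": 0.77,
    "Tegra K1": 0.67,
    "Snapdragon 805": 0.32,
    "Exynos 5420": 0.39,
}

for name, gflops in per_core_gflops.items():
    print(f"{name}: about {estimated_hours(gflops):.0f} hours")
```

Under that assumption the iPhone 6 lands just under 500 hours, while the Lumia 625 on a single core comes out far beyond the 1500 hours mentioned above.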

An iOS BOINC client for the iPhone 6 and later models would be capable of handling ClimatePrediction.net WUs in 500 hours if owners are willing to run them.
I've been running Asteroids and SETI on a Zeepad and its turnaround is much worse than that level, yet these devices are becoming so prolific that their computing power can't be ignored.
Also, the pad market is increasing enormously; it is based on the highest performing RISC-based CPUs and runs predominantly Android and iOS.
Something I didn't look at, but should be significant, is the amount of energy per GFLOP required on these devices compared to desktops and laptops. Completing the WU for much lower energy costs would ease the burden of people donating processing time to the projects.

I'll back off my contention that 90% penetration of the high-end smartphone market of 2013 onwards could handle the BOINC projects' needs, as those phones are about equivalent to 2004-06 x86/x64 GFLOPS performance.
26) Questions and Answers : Windows : Visual Fortran Run-Time Error (Message 51835)
Posted 14 Apr 2015 by marmot
Post:
@Les Bayliss
The error there, is
REPLANCA :I/O ERROR
which is a data mismatch between files.

So, in that particular case, yes it's a problem with the model.



I'm curious, how did you find the specific error code? Not finding specifics on the Workunit 9760129 page.

-----

@Jim

The Programmers will have to be careful not to make the programs so finicky that they cannot be run successfully on an average home computer or they will no longer be suitable as a Boinc project. Then they will be back to trying to raise the money to rent supercomputer time.


Agreed.

The BOINC network is a globally distributed, heterogeneous supercomputer that currently has only, like, 0.0015% of the available computing power tapped by BOINC clients.
With fault tolerant coding, in smaller chunks, the smartphone computing power might be enough to meet all ClimatePrediction.net 's computing needs with computing power to spare.

BOINC market penetration on the desktop, tablet, laptop and smart phone needs to increase. A marketing campaign is needed to make BOINC cool and one of the top d/led apps.

I guess there's enough computing power out there so that clients should be competing for WU, and many just sitting idle, because the servers for all BOINC projects can't get work out fast enough.
27) Questions and Answers : Windows : Visual Fortran Run-Time Error (Message 51822)
Posted 12 Apr 2015 by marmot
Post:
Probably because no other project uses programs that are close to a million lines of source code, or are so complex in what they do.
Add to this the auxiliary files, such as the new, more detailed analysis of the latest version of MOSES + Triffid, and you have a super computer program that doesn't tolerate desktop/laptop computers that aren't "just so".

And, of course, there's also the failures due to planetary physics.

Plus, sometimes there's a data file error, when one of the junior researchers mismatches what's in the data strings in several of the files.



There seems to be a problem with the WU itself.
4 different machines have failed it with an error while computing.

Since it's approaching its 5th and last failure I'm not going to worry any more about it.
If I see any more 'error while computing' it's good to be able to track the WU history.
28) Questions and Answers : Windows : Visual Fortran Run-Time Error (Message 51808)
Posted 11 Apr 2015 by marmot
Post:
The message does not appear in service mode. However, most of the applications don't work in service mode, so that's a solution that's not a solution. So the real solution is to find what causes the message: in one instance for me it was a berserk printer driver that was running at 100% CPU - so the cause could be practically anything. Which isn't a solution either. So this topic continues.


I haven't seen the FORTRAN run time errors and needy GUI input messages for weeks, and this machine managed to complete 1 WU, but another failed. 3 are running and are 130 to 165 hours in. The machine has been up for days, maybe weeks, and this browser instance has 40-50 tabs open without issues. I ran a game a few days back, played videos, and have the print spooler shut down along with various other unnecessary services.

I'm trying to say that the machine is working nicely and no other projects seem to be failing WU so
... why the computing errors?
29) Questions and Answers : Windows : Visual Fortran Run-Time Error (Message 51660)
Posted 20 Mar 2015 by marmot
Post:
The last 7 WU's on computer http://climateapps2.oerc.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=1347460 have thrown off run-time errors. The client will show the project advancing but the task-manager shows no CPU usage. Finally, after responding to the error dialogs the WU ends in computation error.

Should I reset the project?

Do you need more information about the errors submitted?
30) Message boards : Number crunching : Credits. (Message 51210)
Posted 14 Jan 2015 by marmot
Post:

"There are dark places I could point you to on the web where people talk about such matters, but then I'd probably have to kill you. It needs a degree in higher mathematics, an extraordinarily thick skin, and an endless supply of cool, damp towels to wrap around your fevered brow."


Are B.S. in physics and mathematics not high enough to peruse those dark places?
31) Message boards : Number crunching : Credits. (Message 51020)
Posted 22 Dec 2014 by marmot
Post:
I've opened up a can of worms and am choking on it now!

This (http://boinc.berkeley.edu/trac/wiki/ClientSchedOctTen) appears to be the controlling document on credit granting for the BOINC client.
There have been a few other proposals including:
http://boinc.berkeley.edu/trac/wiki/CreditNew
http://boinc.berkeley.edu/trac/wiki/CreditProposal
and some Fixed credit/reference-machine from Dr Anderson.

Supposedly SETI has moved to the CreditNew proposal.

There was a brief, heated debate in a thread about "Utopia Bitcoin not science project" http://boinc.berkeley.edu/dev/forum_thread.php?id=9473#54892 with people that have 400 cores calculating. Points were made on credit inflation from the ASICs, how FLOPS benchmarks are not suitable for credit calculations, and how Utopia seems to be essentially awarding BOINC credit for money donated at the BOINC forums. I'm guessing that conversation was had at other cross platform and team forums.
With the prospect of users with huge numbers of cores looking to leave BOINC (likely to Stanford?) it seems a new generalized credit proposal was made:
http://boinc.berkeley.edu/trac/wiki/CreditGeneralized

BOINC projects seem to be a bit like herding cats, so it will be interesting to see how many projects adopt some form of the new proposal.

EDIT:
This chart shows what the controversy is about.
Notice the inflation from July 2014 which, I assume, is stemming from Utopia.



Yes, certainly Utopia:
32) Message boards : Number crunching : Credits. (Message 51018)
Posted 22 Dec 2014 by marmot
Post:

So, it's your computer(s) that have the problem.
As you're talking about RAC, which is very ephemeral, perhaps you need to think differently on this project.
Each model type (i.e. the name of the 1st part of a model name), takes a different amount of time to run through a "unit of time". A good one to use is a model day, as this is what gets run over and over.
So, as your computer changes model type, the amount of time being spent slaving away to earn one credit will keep changing. This will then be reflected in the RAC. Which makes RAC a fairly useless measure of anything on this project.
Best to check the Granted credit column for each model a day or two after it's finished, and see if it now has credit.

The i5 did toss a single WU out in error after 59,000 sec (irritating) but the rest of the WU from 2014, that were worked on by three machines, completed successfully and are granted. (The t7500 returned errors on every 2011 packet. Really should have checked in and figured THAT out, sorry.)
The i5-2430m is getting ~2350 credit a day on this project (calculated by granted credit/CPU time across 4 threads). Under Asteroids it was getting 6900. The t9300 (3800 per day here and 6500 on Aster@H) and the m620 are similarly lower than their Asteroids credit, so credit received here is roughly 0.3 to 0.6 times what Asteroids grants. Will need many more data points to get a better average figure.
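
For anyone wanting to repeat the comparison, here is a minimal Python sketch of the arithmetic. It assumes "granted credit/CPU time across 4 threads" means credit per CPU-second, scaled to a 24 hour day and multiplied by the number of threads running at once. Both function names are made up for illustration.

```python
SECONDS_PER_DAY = 86400


def credit_per_day(granted_credit, cpu_seconds, threads):
    """Credit per day for a host, assuming `threads` tasks run concurrently
    and each earns credit at the same rate per CPU-second."""
    if cpu_seconds <= 0 or threads <= 0:
        raise ValueError("cpu_seconds and threads must be positive")
    return granted_credit * SECONDS_PER_DAY * threads / cpu_seconds


def credit_ratio(cpdn_per_day, other_per_day):
    """How much credit this project grants relative to another project."""
    if other_per_day <= 0:
        raise ValueError("other_per_day must be positive")
    return cpdn_per_day / other_per_day


# Daily figures quoted in the post (ClimatePrediction vs Asteroids@home).
hosts = {
    "i5-2430m": (2350, 6900),
    "t9300": (3800, 6500),
}

for host, (cpdn, asteroids) in hosts.items():
    print(f"{host}: {credit_ratio(cpdn, asteroids):.2f} x Asteroids credit")
```

The two hosts with quoted numbers give ratios of about 0.34 and 0.58, which is where the 0.3 to 0.6 range comes from.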
These machines are heating the bedroom to 70 degrees and I chose the projects on importance to survival of the species. But a mystery is a mystery and credits are a matter of pride and competition amongst many users so my curiosity is satisfied.

EDIT: So CPDN does follow the cobblestone credit related in the Wiki (http://boinc.berkeley.edu/wiki/Computation_credit) in its trickle calculations?
33) Message boards : Number crunching : Credits. (Message 51005)
Posted 21 Dec 2014 by marmot
Post:
I was starting to hypothesize that my equipment is too old to run ClimatePrediction because the credit received was so small per day compared to other projects.

Under my account it lists my RAC as 1,398 on the equivalent of 6 2nd gen i5 CPU's.
Asteroids@home gets 10,000 RAC on those same 6 CPU's.

EDIT: OK, I see that I have pending credit not applied as far back as Dec 9th so it's going to affect the RAC even on my account page.

I'm still curious if credit received from ClimatePrediction is equivalent to Asteroids.

EDIT 2: OK, the pending credit packets identify as claimed then granted and so seem to be in the RAC calculation. I'm baffled now.

BTW, thanks for answering Les, I notice how much time you spend answering questions on here.
34) Message boards : Number crunching : Credits. (Message 51003)
Posted 21 Dec 2014 by marmot
Post:
Asteroids@home is currently moving their servers and the 6 laptops I have running were getting between 23,000 and 40,000 credits a day between Asteroids/PrimeGrid/SETI/Collatz/MOO!.
With Asteroids down SETI (resource 1000) and ClimatePrediction (resource 500) stepped up and now my RAC is between 2000 and 9000. ClimatePrediction is getting 1/2 of the CPU's available with those settings (no GPU).

That's a serious drop off. Event logs show no computation errors and I see that the credit scripts are getting run at least every other day so why's the credit 1/3 what Asteroids was getting?

