Posts by Niall

21) Message boards : Number crunching : UK Met Office HadCM3 Short (Message 50077)
Posted 8 Sep 2014 by Niall
Post:
I was briefly wondering why this had never been an issue for me, until I realised that it doesn't affect those of us who only run one project at a time.


I'm only running CPDN while I have CPDN work, without interruptions. I crashed 8 hadcm3s units (and successfully ran none) before giving up.

Two hadam3p_anz units, a hadam3p_pnw unit and a hadcm3n unit are all currently running normally.

I don't think that's where the problem lies.
22) Message boards : Number crunching : Compute Errors on HadCM3 short Tasks (Message 49856)
Posted 25 Aug 2014 by Niall
Post:
I have now crashed 8 of these units, with no successful completions.

Since these seemed to be glitching for others I paused other work and let them run for a bit to see if they were stable. All crashed after about 20 min. One hadcm3n unit, one hadam3p_eu unit, and two _pnw units are running normally.

Win7, 64 bit, BOINC release 7.2.42.

It appears some are getting better results than others, so I'm leaving these units for those. I have unchecked the box next to these units in my CPDN user preferences.
23) Message boards : Number crunching : Upload problems (Message 49772)
Posted 19 Aug 2014 by Niall
Post:
To clarify: this is not me doing this work. It is me having an appreciation of the people who do.
24) Message boards : Number crunching : Upload problems (Message 49770)
Posted 18 Aug 2014 by Niall
Post:
Miguel

Patience is sometimes key around here. If you are worried about a couple of hundred MB on your hard drive, you probably need to see this in the context of the sheer amount of work getting done here (and how much of your hard drive it's taking up already).

A CPDN model, as has been discussed elsewhere around here, is in the million lines of Fortran range, and will be taking up gigabytes of hard drive space. Between us we're at the tail end of crunching 80,000 hadam3p_eu work units. Each of those (assuming they don't crash/get aborted by the user, in which case they will be reassigned to another user) will upload 13 zip files, at around 34MB each. A quick back-of-an-envelope calculation tells me that means the servers will be getting back in the range of (13x34x80,000)/1024 = 34,531 gigabytes of data, and change, just from those hadam3p_eu work units.
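
For anyone who wants to check the envelope, here is a minimal Python sketch of the same sum. It uses only the round figures quoted above (13 zips per work unit, roughly 34 MB each, 80,000 work units); the variable names are made up for illustration and real zip sizes will vary.

```python
# Rough estimate of upload volume for the hadam3p_eu batch.
# All inputs are the approximate figures quoted in the post, not measurements.
ZIPS_PER_WORK_UNIT = 13
MB_PER_ZIP = 34            # approximate size of each zip file
WORK_UNITS = 80_000

total_mb = ZIPS_PER_WORK_UNIT * MB_PER_ZIP * WORK_UNITS
total_gb = total_mb / 1024   # 1024 MB per GB, as in the post
total_tb = total_gb / 1024

print(f"{total_mb:,} MB = {total_gb:,.0f} GB = {total_tb:.1f} TB")
```

On those assumptions the batch comes to a little under 34 TB in total.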

Then there are the Pacific North-West (not sure how many of those are being worked on at the moment, but it looks like it's into 5 figures) and ANZ attribution projects and two different coupled model projects (over 11,000 work units ready to be sent plus more on computers just from the geoengineering programme).

What we are probably talking about is a team of ace programmers and techs filling a terabyte drive on a more-or-less daily basis, while keeping the whole lot running.

To my mind, the best thing to do is to report glitches and wait patiently while they fix them. Servers are going to overload and crash, glitches are going to happen, and the credit script might not run reliably.

Under the circumstances, it's a testament to the competence and hard work of these people that we don't have more problems.
25) Message boards : Number crunching : 0 credit for WU (Message 49751)
Posted 17 Aug 2014 by Niall
Post:
The credit script is a bit shaky around here. It often doesn't run, and then you get a glut of backdated credit when they get it working. It's a lower priority than keeping the other servers running, and the past ten days have been pretty bad for the tech people. Four servers were down for most of last week. Two, including a vital one for handling uploaded data from hadam3p_eu units (at minimum), are currently down.

The credit will come through, but it may take time.

I wouldn't lose sleep over it.
26) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 49741)
Posted 16 Aug 2014 by Niall
Post:
Agreed. This time it's the 13.zips that seem to be moving, while it appears everything else is stuck.
27) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 49731)
Posted 15 Aug 2014 by Niall
Post:
Indeed. I had a similar number. I now have a couple of zips showing a transient error, but those invariably clear, so I'm not worried. The server shows the number of WUs in progress dropping very fast.

From something posted on the BOINC forum, Richard Haselgrove (and team?) deserve kudos for a lot of hard work getting the servers back online. Nice work. Hope you get time for a beer or several.
28) Message boards : climateprediction.net Science : Climate change in the News (Message 49730)
Posted 15 Aug 2014 by Niall
Post:
I was reading this article in the (London) Guardian the other day:
http://www.theguardian.com/environment/2014/aug/11/extreme-weather-common-blocking-patterns

It talks about how blocking patterns in the jet stream are linked to extreme weather events, such as the UK's recent winter flooding, the drought in the western US and the heat waves that hit Russia in 2010 and Europe in 2003.

This is relevant to our crunching operations:

"[Dr Dim] Coumou, [at the Potsdam Institute for Climate Impact Research] acknowledges his study shows a correlation - not causation - between more frequent summer blocking patterns and Arctic warming. “To show causality, computer modelling studies are needed, but it is questionable how well current climate models can capture these effects,” he said."

"Prof Tim Palmer, at the University of Oxford, wrote in a PNAS article in 2013 that understanding changes to blocking patterns may well be the key to understanding changes in extreme weather, and therefore to understanding the worst impacts of climate change on society. But he said climate models might have to run down to scales of 1km to do so. “Currently, national climate institutes do not have the high-performance computing capability to simulate climate with 20km resolution, let alone 1km,”"


This actually goes back to something I asked a few months ago.

HadAM3 has a horizontal resolution of 3.75 × 2.5 degrees in longitude × latitude (or about 300km), which means that to reach Professor Palmer's requirements, we would need a model with roughly 300 times finer grid spacing, or about 300x300=90,000 times as many grid cells (unless I'm even worse at maths than I thought, which is always possible), which is absolutely non-trivial. That kind of model just couldn't be run on a domestic system. I doubt it could be run on one of the Met Office's supercomputers.
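
The size of that jump can be sketched in a few lines of Python. This assumes the rough 300km grid spacing quoted above and counts horizontal grid cells only; the function name is hypothetical.

```python
# Compare a ~300 km grid with the 20 km and 1 km grids mentioned in the article.
# 300 km is the post's rough figure for HadAM3's grid spacing (an approximation).
CURRENT_SPACING_KM = 300

def horizontal_cell_factor(current_km: float, target_km: float) -> float:
    """How many times more horizontal grid cells the finer grid needs."""
    linear = current_km / target_km   # refinement along each horizontal axis
    return linear ** 2                # two horizontal dimensions

for target_km in (20, 1):
    factor = horizontal_cell_factor(CURRENT_SPACING_KM, target_km)
    print(f"{target_km} km grid: about {factor:,.0f} times as many cells")
```

This is a lower bound on the extra work: a finer grid normally also needs a shorter time step, so the real increase in computing cost would very likely be larger still.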

That said, if the models are not properly showing the blocking patterns, just how reliable are the attribution studies? If Dr Coumou and Professor Palmer are correct (and I don't know enough to even think about debating with them) it looks to me as if these extreme events are considerably more likely than would be suggested by anything a HadAM3 ensemble is capable of picking up.

Or am I wrong? Tell me I'm wrong...
29) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 49716)
Posted 6 Aug 2014 by Niall
Post:
Okay. It looks fixed. The completed units are off my system, and they're showing as complete in my account (the credit numbers look wrong, but I'm really not bothered about those).

Thanks
30) Message boards : climateprediction.net Science : New and updated pages on the climateprediction.net website (Message 49709)
Posted 5 Aug 2014 by Niall
Post:
In general, I like what you've done here. It's the kind of nice simple explanation I can cite when I squabble with deniers, as well as referring to when I'm trying to explain things to the less educated on the subject.

I agree with Tullus, however. Levels of knowledge vary, but it's always good to have the references.

--- "For state-of-the-art General Circulation Models/Global Climate Models (GCMs) such as the one used in the climateprediction.net experiment, it is more a case of trying to represent everything, even if things then get so complicated that we can't always understand what's going on."

Uh uh. I think this is badly worded. I've been in too many arguments with deliberately obtuse deniers. I understand what you mean here. The deniers will deliberately misunderstand. They will try to make it look like you mean that you're just guessing. I suggest rewording this (if necessary, you can then mod this comment, because I don't want them twisting what I said to make it mean what they want it to mean (coverup), and not what I meant (clarification)).

How about something like, "Often our analysis of complicated results challenges our theories and gives us a better understanding of our modelling of the climate"? I know you are intelligent scientists, but the deniers are always looking for evidence otherwise, even if it's a misleading statement taken out of context.

--- Under resources, the list could get very long, but I'm a heavy user of Skepticalscience, somewhat less so of Carbonbrief, although it's still a valuable resource. Just some thoughts.

--- There is something I would like to see.

We're using our computing resources (and electricity budget) to assist in climate modelling, and I'm happy to do so. Sometimes I feel a little taken for granted, in the sense of doing this in return for what are basically worthless bead-tokens (BOINC credits). I'd like to know what that computer time was used for. Perhaps one way to do this would be a post on the science forum by one of the scientists involved in most projects to let us know what you're doing with the data.

Some things to think about (I don't want to make this too restrictive):
* What model did you use?
* How many model runs did you use?
* Were these new runs, or data from last year you found a new use for?
* What were you investigating?
* What did you find?
* Why does it matter? What policy might it inform?
* Is there a peer-reviewed paper coming out? Where? Will it be open access?


I did a quick back-of-an-envelope calculation on the 80,000-ish runs we've been chewing our way through lately. Assuming they all complete first time, at the rate I'm getting through them, that's about 900 processor-years worth of crunching between us. This had better be worth it!
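
As a sketch, the estimate can be reproduced like this. The post does not state the rate used, so the 100 hours per work unit below is an assumption, chosen because it is the ballpark reported elsewhere on this page for the newer hadam3p_eu units.

```python
# Rough processor-time estimate for the hadam3p_eu batch.
WORK_UNITS = 80_000
HOURS_PER_WORK_UNIT = 100      # assumed; not stated in the post
HOURS_PER_YEAR = 24 * 365.25

processor_years = WORK_UNITS * HOURS_PER_WORK_UNIT / HOURS_PER_YEAR
print(f"about {processor_years:.0f} processor-years")
```

A slower machine (say 125 hours per unit) would push the total above 1,100 processor-years, so "about 900" is best read as an order-of-magnitude figure.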

The way you kept us up to date on the Weather@home UK flooding attribution was very much appreciated; the silence on the ANZ attribution project somewhat less so. Did I miss something? I liked this blog by Dr Stott: http://www.carbonbrief.org/blog/2014/02/what-climate-change-attribution-can-tell-us-about-extreme-weather-and-the-recent-uk-floods/

I don't expect a briefing on everything. First, I know those scientists are busy. They need to be doing science. Equally, with civilisation and entire ecosystems at stake, someone needs to be communicating it. Second, sometimes an experiment may not show any effect (which may mean that, for example, particular events were not the result of climate change or that the models are not sensitive enough to show it, either of which may give aid and comfort to the deniers, even where other factors may have been involved: not everything can be readily shown to be down to climate change).

As I'm sure you're aware, debates with deniers are ongoing on blogs, forums and newspaper letter pages all over the world in an effort to transform public opinion. I will help you make bullets: if you give me bullets back, I will help fire them. If I can tell deniers that climate change increased the probability of severe flooding by n% at the 95% CI, this is useful.
31) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 49707)
Posted 5 Aug 2014 by Niall
Post:
There is definitely still an upload/reporting problem here. I have three completed work units sitting on my system. I have no idea whether this is connected to my previous report of events not showing up in the event log:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/forum_thread.php?id=7857

The following work units have completed over the past several days:
hadam3p_eu_o4kg_2013_1_008832505
hadam3p_eu_lbq3_2013_1_008826420
hadam3p_eu_l7u5_2013_1_008821382

They all sent the final zip file to the server:
01-Aug-2014 17:11:36 [climateprediction.net] Finished upload of hadam3p_eu_o4kg_2013_1_008832505_0_13.zip
04-Aug-2014 14:13:36 [climateprediction.net] Finished upload of hadam3p_eu_lbq3_2013_1_008826420_1_13.zip
05-Aug-2014 05:14:17 [climateprediction.net] Finished upload of hadam3p_eu_l7u5_2013_1_008821382_0_13.zip

BOINC is still trying to report them as complete:
05-Aug-2014 08:40:26 [climateprediction.net] Reporting 3 completed tasks

They are all still showing as "Ready to report" in BOINC manager.

None of them are logged as having completed on the server:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8978434
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8972349
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/workunit.php?wuid=8967311

I'm not sure what else might be useful. WU 8978434 (o4kg) completed, as noted, around the time my computer crashed. The others logged a problem at around the same time, which I mentioned a few days ago.

Please advise.

Thanks
32) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 49692)
Posted 2 Aug 2014 by Niall
Post:
Thanks. I will try that. I was wondering why the GPU never got any work. I tried a few things and gave up.

If you count the trickles, Task 16751278 finished yesterday but, as you can see, it's still showing as in progress, presumably because it's not reporting as finished, which is what I was trying to explain.
33) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 49690)
Posted 2 Aug 2014 by Niall
Post:
This is all I get:
02-Aug-2014 08:17:14 [climateprediction.net] Sending scheduler request: To report completed tasks.
02-Aug-2014 08:17:14 [climateprediction.net] Reporting 1 completed tasks
02-Aug-2014 08:17:14 [climateprediction.net] Requesting new tasks for intel_gpu
02-Aug-2014 08:17:18 [climateprediction.net] Scheduler request completed: got 0 new tasks
02-Aug-2014 09:17:56 [climateprediction.net] Sending scheduler request: To report completed tasks.
02-Aug-2014 09:17:56 [climateprediction.net] Reporting 1 completed tasks
02-Aug-2014 09:17:56 [climateprediction.net] Requesting new tasks for intel_gpu
02-Aug-2014 09:17:59 [climateprediction.net] Scheduler request completed: got 0 new tasks

The rest of the program is running normally - intermediate zips are uploading, new tasks downloading as instructed. It's just that one completed task not clearing properly.
34) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 49688)
Posted 2 Aug 2014 by Niall
Post:
Nope. BOINC is still reporting the WU as finished to the CPDN server, but the unit is still sitting on my drive showing as "ready to report". It has been for about the past 20 hours. Not sure if this is the same bug or a different one.
35) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 49686)
Posted 2 Aug 2014 by Niall
Post:
I've got a different problem, and I'm not sure if it's an upload problem or the result of a crash. Last night my system crashed. I don't know what caused it: it may have been BOINC or it may have been either of a couple of other things.

The stdoutdae.txt file logged the following:
"01-Aug-2014 17:13:06 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with zero status but no 'finished' file
01-Aug-2014 17:13:06 [climateprediction.net] If this happens repeatedly you may need to reset the project.
01-Aug-2014 17:13:07 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error.
01-Aug-2014 17:13:07 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:08 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error.
01-Aug-2014 17:13:08 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_lbq3_2013_1_008826420_1 exited with zero status but no 'finished' file
01-Aug-2014 17:13:10 [climateprediction.net] If this happens repeatedly you may need to reset the project.
01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_l7u5_2013_1_008821382_0 exited with zero status but no 'finished' file
01-Aug-2014 17:13:10 [climateprediction.net] If this happens repeatedly you may need to reset the project.
01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_l51h_2013_1_008817758_0 exited with zero status but no 'finished' file"

Then I get repeated messages that
01-Aug-2014 17:13:10 [climateprediction.net] Task hadam3p_eu_h62y_2013_1_008861707_1 exited with a DLL initialization error.
01-Aug-2014 17:13:10 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:12 [climateprediction.net] Task hadam3p_eu_lbq3_2013_1_008826420_1 exited with a DLL initialization error.
01-Aug-2014 17:13:12 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:12 [climateprediction.net] Task hadam3p_eu_l7u5_2013_1_008821382_0 exited with a DLL initialization error.
01-Aug-2014 17:13:12 [climateprediction.net] If this happens repeatedly you may need to reboot your computer.
01-Aug-2014 17:13:12 [climateprediction.net] Task hadam3p_eu_l51h_2013_1_008817758_0 exited with a DLL initialization error.

This happened repeatedly over the space of 25 seconds. Then I get what looks like BOINC's usual startup readout as the system reboots after the crash.

The work units themselves are now running normally, but this all occurred just a couple of minutes after unit hadam3p_eu_o4kg_2013_1_008832505_0 finished its run.
01-Aug-2014 17:11:36 [climateprediction.net] Finished upload of hadam3p_eu_o4kg_2013_1_008832505_0_13.zip

This is fine, but it's been reporting this, repeatedly, all night:
01-Aug-2014 22:25:30 [climateprediction.net] Sending scheduler request: To report completed tasks.
01-Aug-2014 22:25:30 [climateprediction.net] Reporting 1 completed tasks
...
02-Aug-2014 09:17:56 [climateprediction.net] Sending scheduler request: To report completed tasks.
02-Aug-2014 09:17:56 [climateprediction.net] Reporting 1 completed tasks

The remains of task hadam3p_eu_o4kg_2013_1_008832505_0 are still on my system, with a "ready to report" status more than 16 hours after completion.

Is this the result of the crash, or is this part of your server glitch? I seem to be uploading zips from the working WUs normally.
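
The "more than 16 hours" figure can be checked from the timestamps in the log lines quoted above. This is a minimal sketch: the regular expression is inferred from the lines in this post rather than from any BOINC documentation, `parse_line` is a hypothetical helper, and `%b` assumes an English locale for the month abbreviation.

```python
import re
from datetime import datetime

# Layout inferred from the quoted lines: "DD-Mon-YYYY HH:MM:SS [project] message"
LOG_LINE = re.compile(
    r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[([^\]]+)\] (.*)$"
)

def parse_line(line):
    """Split a client log line into (timestamp, project, message)."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    stamp, project, message = match.groups()
    return datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S"), project, message

# Both lines are copied from the log excerpts in this post.
uploaded = ("01-Aug-2014 17:11:36 [climateprediction.net] Finished upload of "
            "hadam3p_eu_o4kg_2013_1_008832505_0_13.zip")
still_reporting = ("02-Aug-2014 09:17:56 [climateprediction.net] "
                   "Reporting 1 completed tasks")

elapsed = parse_line(still_reporting)[0] - parse_line(uploaded)[0]
hours = elapsed.total_seconds() / 3600
print(f"Still unreported {hours:.1f} hours after the final upload")
```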
36) Message boards : Number crunching : Possible problem with new EU work units (Message 49589)
Posted 18 Jul 2014 by Niall
Post:
Sorry. Delete. My mistake this time.
37) Message boards : Number crunching : Possible problem with new EU work units (Message 49553)
Posted 13 Jul 2014 by Niall
Post:
I left the computer on overnight to see if this was a transient problem or a misconfiguration here.

WU ne8f apparently completed normally, and uploaded zips 12 and 13.
WU g6dk uploaded zips 6 and 7 as expected.

Oh, this is interesting. The whole thing may be a minor bug. I was going on the output from the BOINC manager event log, but I've had a look at the "stdoutdae.txt" file, which I wasn't aware of until Nigel Garvey mentioned it. The output from the two corresponds in all particulars, as far as I can see, except for two. The event log fails to mention two incidents that are in the "stdoutdae.txt" file:

12-Jul-2014 13:46:05 [climateprediction.net] Started upload of hadam3p_eu_g6dk_2013_1_008856209_0_5.zip
12-Jul-2014 13:52:26 [climateprediction.net] Finished upload of hadam3p_eu_g6dk_2013_1_008856209_0_5.zip

12-Jul-2014 19:29:29 [climateprediction.net] Started upload of hadam3p_eu_ne8f_2013_1_008812943_0_11.zip
12-Jul-2014 19:35:38 [climateprediction.net] Finished upload of hadam3p_eu_ne8f_2013_1_008812943_0_11.zip

The problem may lie with a minor error in my event log, and otherwise be a false alarm, in which case all I probably need to do is update BOINC.
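
Comparing the two logs by eye is error-prone, so here is a hedged sketch of doing it in Python. It assumes both logs have been saved as plain text with one entry per line; `lines_missing_from` is a hypothetical helper, and the placeholder line is invented purely so the sample has something in common between the two lists.

```python
def lines_missing_from(event_log_lines, stdoutdae_lines):
    """Return stdoutdae.txt lines that never appear in the event log, in order."""
    seen = {line.strip() for line in event_log_lines}
    return [line.strip() for line in stdoutdae_lines
            if line.strip() and line.strip() not in seen]

# Invented line, only here so the two samples share an entry.
PLACEHOLDER = "12-Jul-2014 00:00:00 [example] placeholder line present in both logs"

# The two upload lines are copied from this post.
stdoutdae_sample = [
    PLACEHOLDER,
    "12-Jul-2014 13:46:05 [climateprediction.net] Started upload of "
    "hadam3p_eu_g6dk_2013_1_008856209_0_5.zip",
    "12-Jul-2014 13:52:26 [climateprediction.net] Finished upload of "
    "hadam3p_eu_g6dk_2013_1_008856209_0_5.zip",
]
event_log_sample = [PLACEHOLDER]

missing = lines_missing_from(event_log_sample, stdoutdae_sample)
for line in missing:
    print(line)
```

With real files, the two lists would come from reading each file's lines; an exact-match comparison like this only works if both logs format their timestamps identically.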

I suppose the next question is, are those files on your server or not?
38) Message boards : Number crunching : Possible problem with new EU work units (Message 49550)
Posted 12 Jul 2014 by Niall
Post:
Welcome.

I have also noticed they have been crunching unusually fast. I've completed one of these already. Usually hadam3p_eu units take my system about 125 hours of crunching. That first one of the new batch (hadam3p_eu_i3gp_2013_1_008768925) took under 100 hours, and these two are on schedule to take about the same. It occurred to me the two phenomena might be connected.
39) Message boards : Number crunching : Possible problem with new EU work units (Message 49548)
Posted 12 Jul 2014 by Niall
Post:
By 4-character code, I assume you mean the alphanumeric code after the hadam3p_eu on the work unit name - hadam3p_eu_*xxxx*_2013_1_nnnnnnnnn_0.

These are ne8f and g6dk.
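
For pulling that code out of a longer list of names, a small regular expression does the job. This is a sketch only: the pattern is inferred from the task names quoted on this page, not from any official naming specification, and `batch_code` is a hypothetical helper.

```python
import re
from typing import Optional

# Inferred layout: hadam3p_eu_<4-char code>_<4 digits>_<digits>_<digits>[_<digits>]
# The meaning of the numeric fields is not assumed here.
TASK_NAME = re.compile(
    r"^hadam3p_eu_(?P<code>[a-z0-9]{4})_\d{4}_\d+_\d+(?:_\d+)?$"
)

def batch_code(task_name: str) -> Optional[str]:
    """Return the 4-character code from a hadam3p_eu name, or None if no match."""
    match = TASK_NAME.match(task_name)
    return match.group("code") if match else None

for name in ("hadam3p_eu_ne8f_2013_1_008812943_0",
             "hadam3p_eu_g6dk_2013_1_008856209_0"):
    print(name, "->", batch_code(name))
```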
40) Message boards : Number crunching : Possible problem with new EU work units (Message 49546)
Posted 12 Jul 2014 by Niall
Post:
I'm currently crunching a couple of the latest batch of hadam3p_eu work units. Usually, these units send a dozen updates as zip files to the server at evenly spaced intervals (at roughly 8.3%, 16.7%, 25% complete, and so on).

Work unit 8958921 has failed to create zip number 11, now at 92.8% complete.
Work unit 9002138 has failed to create zip number 5, now at 47.7% complete.

Is this a problem?
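
To show why those two zips look overdue, here is a minimal sketch of the expected schedule. It assumes the spacing described above, one intermediate zip every twelfth of the run (about 8.3%); `zips_expected` is a hypothetical helper, not anything in BOINC.

```python
import math

ZIP_INTERVALS = 12  # assumed: one zip every 1/12 of the run (about 8.3%)

def zips_expected(percent_complete: float, intervals: int = ZIP_INTERVALS) -> int:
    """Number of intermediate zips that should exist at this point in the run."""
    return min(intervals, math.floor(percent_complete * intervals / 100))

# Work unit numbers and progress figures are the ones quoted in this post.
for wu, progress, missing_zip in ((8958921, 92.8, 11), (9002138, 47.7, 5)):
    expected = zips_expected(progress)
    overdue = expected >= missing_zip
    print(f"WU {wu}: {expected} zips expected at {progress}%, "
          f"zip {missing_zip} overdue: {overdue}")
```

On that schedule zip 11 is due at about 91.7% and zip 5 at about 41.7%, so both work units are already past the point where the missing file would normally appear.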



©2024 climateprediction.net