Posts by Thyme Lawn

41) Questions and Answers : Windows : CPDN gone from Project tab after crash (Message 51648)
Posted 17 Mar 2015 by Profile Thyme Lawn
Post:
Unless you have a backup made before the crash there is nothing you can do to recover the lost WU's. Try reattaching to the missing projects and see if they send new work. Good luck.

If you do try reattaching you might need to stop BOINC and delete the corrupt account files first. You'll need to reattach as an existing user, using the email address shown on the Change email address page.

Alternatively it's fairly easy to regenerate the account files manually. First you'll need to stop BOINC and obtain the following things to rebuild the file:

  • the project's master URL.
  • your "Account key" from the Your account page.
  • the project name (this might not be essential as it should be retrieved when you restart BOINC).



For CPDN the account_climateprediction.net.xml file (edited with a plain text editor) would contain the following (replacing the "XXXX" with your account key):

<account>
    <master_url>http://climateprediction.net/</master_url>
    <authenticator>XXXX</authenticator>
    <project_name>climateprediction.net</project_name>
</account>

When you restart BOINC you should automatically reattach to the project.
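
If you'd rather script the rebuild than type it by hand, here is a minimal Python sketch that writes the same file. The function names and the data_dir argument are my own for illustration (they are not part of BOINC), it assumes BOINC has already been stopped, and you still need to supply your own account key.

```python
from pathlib import Path
from xml.sax.saxutils import escape

# Values taken from the post above; the account key must come from
# the "Your account" page.
MASTER_URL = "http://climateprediction.net/"
PROJECT_NAME = "climateprediction.net"
ACCOUNT_FILE = "account_climateprediction.net.xml"


def build_account_xml(account_key: str) -> str:
    """Return the account file contents for the given account key."""
    return (
        "<account>\n"
        f"    <master_url>{escape(MASTER_URL)}</master_url>\n"
        f"    <authenticator>{escape(account_key)}</authenticator>\n"
        f"    <project_name>{escape(PROJECT_NAME)}</project_name>\n"
        "</account>\n"
    )


def write_account_file(data_dir, account_key: str) -> Path:
    """Write the account file into the BOINC data directory (hypothetical helper)."""
    target = Path(data_dir) / ACCOUNT_FILE
    target.write_text(build_account_xml(account_key), encoding="utf-8")
    return target
```

Call write_account_file() with your BOINC data directory and account key, then restart BOINC.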

Either way you'll retain your ID and your credit and RAC are safe, but the empty statistics file does mean that the historical statistics shown in BOINC Manager will have to start over from your current values.
42) Message boards : Number crunching : Project Resource Share ? (Message 51618)
Posted 13 Mar 2015 by Profile Thyme Lawn
Post:
I've raised your problem with the project team STE\/E.
43) Questions and Answers : Unix/Linux : models not avail for Linux - AMD x86_64 or Intel EM64T CPU (Message 51615)
Posted 13 Mar 2015 by Profile Thyme Lawn
Post:
Any ideas about the cause of this message?

The project team need to get Linux only MOSES II tasks returned as soon as possible. The UK MET Office HadAM3P-HadRM3P Europe application has temporarily been restricted to Windows only in an attempt to achieve this.

The applications page shows the current model availability for each platform and the server status page shows what work is ready to be sent (I've asked the project team to change the hadam3p_eu label to make it clear that it's currently Windows only).
44) Questions and Answers : Unix/Linux : Not accepting requests. (Message 51607)
Posted 11 Mar 2015 by Profile Thyme Lawn
Post:
I've asked the project team to re-enable work for your host #1332020.

Done
45) Questions and Answers : Unix/Linux : Not accepting requests. (Message 51606)
Posted 11 Mar 2015 by Profile Thyme Lawn
Post:
Hi Richard,

I've asked the project team to re-enable work for your host #1332020.

Looking through your list of computers I see that host #1323410 is also missing the 32bit libraries. Work fetch is still enabled for that computer, but you'll need to install the libraries there too.
46) Message boards : Number crunching : hadRM3P Europe 7.26 No Trickles No Credit (Message 51600)
Posted 10 Mar 2015 by Profile Thyme Lawn
Post:
Jonathan has restarted trickle processing.
47) Message boards : Number crunching : hadRM3P Europe 7.26 No Trickles No Credit (Message 51595)
Posted 10 Mar 2015 by Profile Thyme Lawn
Post:
It looks like trickle processing hasn't been working since early on 5th March (one of MartinNZ's tasks has a processed trickle timed at 05 Mar 2015 00:25:25 and my first missing trickle after that was returned at 05 Mar 2015 02:04:49). It's possibly related to last week's download server failure.

I've notified the project team.
48) Message boards : Number crunching : News and Announcements (Message 51335)
Posted 27 Jan 2015 by Profile Thyme Lawn
Post:
The project will have some scheduled downtime on Thursday and Friday.

The scheduled downtime has been pushed back a week and will now be on Thursday 5th and Friday 6th February.
49) Message boards : Number crunching : News and Announcements (Message 51333)
Posted 27 Jan 2015 by Profile Thyme Lawn
Post:
The project will have some scheduled downtime on Thursday and Friday.

This is to allow the underlying hardware to be configured to accept a tape backup system as part of the 'near-line' storage.

Jonathan will be taking the opportunity to move the database backup to a different server, to give more resilience in case of hardware failures.

The downtime ought to be no more than a few hours on Thursday, but Jonathan does acknowledge that he's said that many times before!

All of the virtualised servers will be offline briefly:

  • climateprediction.net
  • trillionthtonne.net
  • climateapps2.oerc.ox.ac.uk
  • cpdn-upload2.oerc.ox.ac.uk
  • cpdn-results2.oerc.ox.ac.uk
  • database server

50) Questions and Answers : Windows : Multiple download errors (Message 51316)
Posted 26 Jan 2015 by Profile Thyme Lawn
Post:
What happens if you try to download the files using your browser?

The URLs are:

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/download/mirror.php?file=/hadam3p_afr_7.22_windows_intelx86.exe

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/download/mirror.php?file=/hadam3p_pnw_7.22_windows_intelx86.exe

http://climateapps2.oerc.ox.ac.uk/cpdnboinc/download/mirror.php?file=/hadcm3s_7.24_windows_intelx86.exe

If that works you should be able to move the file to <BOINC data directory>\projects\climateprediction.net

If you're running BOINC as a service you should check that the files have the right access permissions ("Full control" for the "boinc_admins" and "boinc_projects" groups and "Read & execute" for the "boinc_users" group). They'll probably inherit the right permissions from the directory, but I always check just in case.
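
If the browser download works and you want to fetch all three files in one go, something like the following Python sketch should do it. The helper names are mine, the data directory is whatever your installation uses, and it does not touch the access permissions, so those still need checking by hand.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlretrieve

# Mirror script and file names copied from the URLs listed above.
MIRROR = "http://climateapps2.oerc.ox.ac.uk/cpdnboinc/download/mirror.php"
FILES = [
    "hadam3p_afr_7.22_windows_intelx86.exe",
    "hadam3p_pnw_7.22_windows_intelx86.exe",
    "hadcm3s_7.24_windows_intelx86.exe",
]


def mirror_url(file_name: str) -> str:
    """Build the download URL for one application file."""
    return f"{MIRROR}?file=/{quote(file_name)}"


def fetch(file_name: str, data_dir) -> Path:
    """Download one file into <data_dir>/projects/climateprediction.net."""
    target = Path(data_dir) / "projects" / "climateprediction.net" / file_name
    urlretrieve(mirror_url(file_name), str(target))
    return target


def fetch_all(data_dir) -> list:
    """Download every file in FILES (stop BOINC first)."""
    return [fetch(name, data_dir) for name in FILES]
```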
51) Questions and Answers : Preferences : Tasks checkpoint to disk at most every [ ] Seconds. (Message 50575)
Posted 21 Oct 2014 by Profile Thyme Lawn
Post:
All of the CPDN applications perform their checkpoints at fixed points (every model day for HadAM3P and every 6 model days for HadCM3), so the setting has no effect on the project. Some other projects have similar restrictions either across the board or with specific applications (e.g. WCG's CEP2 and FAHV applications have fixed checkpoints but the other applications use the preference setting).

Some projects have applications which write a lot of data during checkpoints and I remember one case where a user was finding no progress being made on a slow computer because he'd set the checkpoint interval shorter than the time taken to write the checkpoint data!

I'm not sure what the default checkpoint interval is (possibly 60 seconds), but 600 seconds works well for me.
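
To illustrate why an interval shorter than the write time stalls a task, here is a toy Python model. It is my own simplification rather than how BOINC or any application actually schedules checkpoints: it assumes the interval is measured from the start of one checkpoint to the start of the next, and that no computing happens while the checkpoint is being written.

```python
def useful_fraction(interval_s: float, write_s: float) -> float:
    """Fraction of wall time left for computing in the toy model.

    interval_s: checkpoint interval preference, in seconds.
    write_s: time taken to write one checkpoint, in seconds.
    """
    if interval_s <= 0 or write_s < 0:
        raise ValueError("interval must be positive and write time non-negative")
    # If writing takes at least as long as the interval, the next checkpoint
    # is due as soon as the previous one finishes, so nothing gets computed.
    compute_s = max(0.0, interval_s - write_s)
    cycle_s = max(interval_s, write_s)
    return compute_s / cycle_s
```

With a 600 second interval and a 30 second write the model leaves 95% of the time for computing; with a 10 second interval and the same write time it leaves none.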
52) Message boards : Number crunching : Weird Clouds on hadam3p_anz (Message 50450)
Posted 9 Oct 2014 by Profile Thyme Lawn
Post:
I've checked the log for the original downloads for one of my Linux machines, and there's no download for globe.jpg
So globe, plus the jpg for the view of a specific model type, must be bundled up with the graphics program for each model type.

The global and regional jpg files are in the application's "se" zip file (hadam3p_pnw_se_7.22_windows_intelx86.zip for PNW v7.22 on Windows).
53) Message boards : Number crunching : hadam3p_pnw task not making progress (Message 50413)
Posted 8 Oct 2014 by Profile Thyme Lawn
Post:
hadam3p_pnw_w1i6_2006_1_009087092_1 has also been aborted after failing to make progress.
54) Message boards : Number crunching : hadam3p_pnw task not making progress (Message 50407)
Posted 8 Oct 2014 by Profile Thyme Lawn
Post:
I have just (on the advice of the project team) aborted a task from the new batch of WAH PNW tasks (hadam3p_pnw_w1wr_2006_1_009087617_0).

After 35 minutes it was still stuck at 0% with no checkpoints made, with less than 1 second CPU time for the worker processes. Going by the contents of the task's datain, dataout and jobs directories it had been set up properly.

I restarted BOINC and the task still wasn't showing any significant CPU time after 25 minutes (1.765 seconds for the controller process, 0.203 seconds for the global worker and 0.140 seconds for the regional worker).

Andy said this task was from batch 82 which will have task names starting with hadam3p_pnw_w0ny_ through to hadam3p_pnw_w20j_. If anyone experiences the same problem with a task from this batch you should abort it and report the problem here.
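
For anyone with a lot of tasks to check, here is a small Python sketch that tests whether a task name falls inside the quoted range. It assumes the 4 character identifier after hadam3p_pnw_ sorts in plain alphanumeric order (digits before lower case letters), which is my reading of the range and hasn't been confirmed by the project team. The function name is mine.

```python
PREFIX = "hadam3p_pnw_"
# First and last identifiers of batch 82, as quoted above.
BATCH_82_FIRST = "w0ny"
BATCH_82_LAST = "w20j"


def in_batch_82(task_name: str) -> bool:
    """Return True if the task name lies in the batch 82 range."""
    if not task_name.startswith(PREFIX):
        return False
    # The identifier is the field immediately after the prefix.
    run_id = task_name[len(PREFIX):].split("_", 1)[0]
    if len(run_id) != len(BATCH_82_FIRST):
        return False
    # String comparison matches alphanumeric order for equal-length
    # identifiers made of digits and lower case letters.
    return BATCH_82_FIRST <= run_id <= BATCH_82_LAST
```

The task I aborted (hadam3p_pnw_w1wr_2006_1_009087617_0) falls inside the range under this check.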
55) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 50241)
Posted 20 Sep 2014 by Profile Thyme Lawn
Post:
And Yes Dave, although we do lead the world, it's not by that much ;-) Twas a typo, should have been 18th.

I hate to be pedantic Martin, but as far as time goes Kiribati's Line Islands lead the world at UTC +14:00 :)

Kiribati has 3 island groups in different timezones. More than 90% of the population lives on the Gilbert Islands, which have always been UTC +12:00. When Kiribati gained its independence from the UK in 1979 the USA relinquished their claims on the Phoenix Islands and Line Islands, with the formal transfer happening 4 years later. Phoenix Islands (UTC -11:00) and Line Islands (UTC -10:00) were a day behind until the International Date Line was redrawn on 1st January 1995 to remove the anomaly.
56) Questions and Answers : Macintosh : I think my Account and/or computers are blocked - how do I get unlocked (Message 50220)
Posted 17 Sep 2014 by Profile Thyme Lawn
Post:
Your computer should be unblocked now ewulf.
57) Questions and Answers : Macintosh : I think my Account and/or computers are blocked - how do I get unlocked (Message 50216)
Posted 16 Sep 2014 by Profile Thyme Lawn
Post:
Yes, that's the same issue ewulf. Your computer is listed as running the broken BOINC version. I'll ask the project team to unblock it.
58) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 50206)
Posted 16 Sep 2014 by Profile Thyme Lawn
Post:
uploader.oerc is down. Is that why I have 16 files ready for transfer/uploading and yet are backed off?

Doubtful. All of my pending WAH ANZ uploads will be sent to cpdn-upload4.oerc.ox.ac.uk (which isn't listed on the server status page, but nslookup tells me it's an alias for rwah0.rdsi.tpac.org.au). The system does respond to ping and traceroute (26 hops with an average elapsed time of 390ms) and a browser test of the upload URL eventually gives the expected response:
<data_server_reply>
<status>1</status>
<message>no command</message>
</data_server_reply>

This suggests the system is working but something is causing upload attempts to time out before anything has been sent.
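
If you want to check the reply from a script instead of a browser, here is a minimal Python sketch that parses the text shown above. It only reads the status and message fields. The function name is mine, and fetching the reply from the upload URL is left out because the exact URL isn't quoted here.

```python
import xml.etree.ElementTree as ET

# The reply text exactly as the browser test returned it.
SAMPLE_REPLY = """<data_server_reply>
<status>1</status>
<message>no command</message>
</data_server_reply>"""


def parse_reply(text: str) -> tuple:
    """Return (status, message) from a data_server_reply document."""
    root = ET.fromstring(text)
    if root.tag != "data_server_reply":
        raise ValueError(f"unexpected root element: {root.tag}")
    status = int(root.findtext("status", default="").strip())
    message = root.findtext("message", default="").strip()
    return status, message
```

For the sample above parse_reply() gives a status of 1 and the message "no command", which is the response I'd expect from a working server that was sent no upload request.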
59) Message boards : Number crunching : 100 000 years of computing time (Message 50186)
Posted 15 Sep 2014 by Profile Thyme Lawn
Post:
My CPDN account claims equivalence, for me, "HadSM3 Model-Years 244,833.32". (Could that be ~2.5 times WCG's expected contribution? Hardly seems worth WCG's effort!)

I'm sure WCG are referring to CPU time rather than modelled time.
60) Message boards : Number crunching : ANOTHER UPLOAD PROBLEM (Message 49738)
Posted 15 Aug 2014 by Profile Thyme Lawn
Post:
On that subject, I see that the problem of the _13.zip file uploads failing with the error "No such file or directory" has already been passed directly into the lab by one of the other messengers.

My _13.zip uploads started working at around 0930 UTC.



©2024 climateprediction.net