climateprediction.net home page
Posts by old_user651284

1) Message boards : Number crunching : Credit updates? (Message 49497)
Posted 4 Jul 2014 by Profile old_user651284
Post:
Hi Chaps and Chapesses,

Last week our BOINC server suffered a corrupted hard drive and had to be restored from a backup. This meant that various fixes that had been put in place during the 'at-risk' period we have had for the last three weeks were undone - and one of these was a patch to allow the credit scripts to run correctly on our servers.

CPDN uses a custom credit generation script which I have spent today fixing.

The script is running, but given the age of CPDN and the large number of users, it takes several hours to complete.

I am hopeful that the script will complete successfully this time, which means that you should get your credits in a few hours (it is currently 5 pm BST).

Sorry for the inconvenience.

Jonathan
CPDN Sys-Admin
2) Message boards : Number crunching : Credit updates? (Message 49325)
Posted 10 Jun 2014 by Profile old_user651284
Post:
Hi,

The db_dump script won't run in daemon mode on our server, so I have to run it manually until I can recompile our boinc daemons.

I ran it manually yesterday.

The recompiling won't happen for a while, because we are in the middle of major work to improve the back end storage servers.

In the meantime, I will run the script manually whenever I get the chance, which won't be until tomorrow (Wed 11 June).

Jonathan Miller
CPDN Sys-Admin
3) Message boards : Number crunching : Credit updates? (Message 49257)
Posted 29 May 2014 by Profile old_user651284
Post:
Hello everyone,

I am the system administrator for CPDN, and it is my role to fix issues such as the one with which this thread is concerned.

I am aware that many of you have concerns about the lack of credit exports to sites such as http://boincstats.com/en/stats/2/project/detail

I have looked into this, and I think it is now fixed, which can be verified at the link above.

Those of you that follow the project closely probably know that we are in the process of improving the computing infrastructure behind CPDN. This involves moving our services to new servers and providing a new mechanism for the project scientists to release and process the work that our volunteers do for us.
This has coincided with two high-profile experiments, the UK Floods and Australian heatwave experiments, which have been occupying the time of the computing team (there are indeed only 2 of us).

I admit that I have considered the credit aggregator issue to be low priority until now, because we have been working on these other issues. I hope that soon you will all be able to see the results of the work that we have undertaken.

...but for now, at least this one issue has been fixed :-)

Jonathan Miller
CPDN System-Administrator
4) Message boards : Number crunching : NZ Application "not in DB" (Message 48694)
Posted 3 Apr 2014 by Profile old_user651284
Post:
The 'Not in DB' error was the result of our attempt to fix the missing font files in the Windows ANZ model.

The error merely meant that the web page could not determine which model version number to display on the website - it did not affect the workunit in any way.

We have now fixed this error, so you should not be seeing it for any of your models as of 3 April 2014.

Jonathan

CPDN Sys-Admin
5) Message boards : Number crunching : ANZ model upload problems. (Message 48539)
Posted 26 Mar 2014 by Profile old_user651284
Post:
This appears to be an NFS file locking issue.
It currently affects about 1% of the files that have uploaded.

The solution would be to stop file locks on the NFS-mounted storage device on the ANZ server, but I am not yet sure of the implications this would have - I am guessing the effect would be minimal, but I am checking with the server's admin.

Jonathan

CPDN-sysadmin
6) Questions and Answers : Wish list : Regional Participation (Message 48522)
Posted 25 Mar 2014 by Profile old_user651284
Post:
Hi Tony,

We had actually thought about this problem, so the release of ANZ work is timed to coincide with working hours in Australia/New Zealand. By releasing it after the working day here, we hope to give all the interested parties in ANZ a fair chance to get the work that is released.

Keep an eye out for new work within a few hours :-)

Jonathan

CPDN Sys-admin
7) Message boards : climateprediction.net Science : Project Downtime 10 January (Message 47950)
Posted 8 Jan 2014 by Profile old_user651284
Post:
Hi, I have to take the project database off-line for the first of (hopefully only) two upgrades at the end of this week.
I will schedule the downtime from:

12 Noon GMT on 10 Jan
until
12 Noon on 13 Jan 2014.

There will be no database access during this downtime.

For those who are interested, this is the first step towards retiring our old database server, and moving her functions over to the virtualised infrastructure at the Oxford e-Research Centre.

Jonathan Miller
CPDN System-Administrator
8) Message boards : climateprediction.net Science : ClimatePrediction.Net 10 year anniversary event: London 13 Sept 2013 (Message 47164)
Posted 25 Sep 2013 by Profile old_user651284
Post:
The slides for the talk at this event are here:

http://www.climateprediction.net/wp-content/uploads/2013/09/10years_cpdn_pub.pptx (14 MB)
9) Message boards : climateprediction.net Science : ClimatePrediction.Net 10 year anniversary event: London 13 Sept 2013 (Message 46963)
Posted 5 Sep 2013 by Profile old_user651284
Post:
CPDN's 10th anniversary is fast approaching!

Celebrations with the cpdn-team and the citizen scientists who make climateprediction.net an ongoing success will take place on Friday 13 September 2013, at the Royal Society in London. The first part of the day will be a workshop, which will then be followed by a drinks reception at 4 p.m.

You can find the details of the programme and venue at:
http://www.eresearchsouth.ac.uk/events/current-and-future-directions-of-citizen-science

Please register for the event at http://asp.artegis.com/citizen_science and join us on this occasion!

Myles

Professor Myles Allen, Principal Investigator, ClimatePrediction.net
10) Questions and Answers : Unix/Linux : Unable to verify using certificates (Message 46826)
Posted 21 Aug 2013 by Profile old_user651284
Post:
Hi,

Your post suggests that the executable files have the wrong signature.
We have had a look, and we don't think anything has changed at our end (well, as regards the file download mechanism anyway :-) ).


I have checked, and I can see that we are indeed supplying the same executable files that we always were, and we have certainly not changed our database records of the signatures of these files.

That suggests that the problem is with your client_state.xml file, which your BOINC client uses to check that it has downloaded the correct file.

If you could post the snippets of the appropriate <file> segments of the client_state.xml, then we could offer more advice.


<file>
<name>hadcm3n_um_6.07_i686-apple-darwin.zip</name>
<nbytes>2511269.000000</nbytes>
<max_nbytes>0.000000</max_nbytes>
<status>1</status>
<executable/>
<signature_required/>
<file_signature>
73cbabbaa42a28bc32cebde27296f9d0c86f0fe14e83765866dece57e4d65b42
27aee17f827a15678d9f040c088eb6b29cdddd73bd10e434b2a3c9203323b5d0
9d48614565f164a7d735a479b321b1e1d7f5b4f4fa11e3e83fc36f0446bd58a5
9c75b7768dfb49d32c37b4872cef77eec42428c6f2ee9e104a6f4e96101c4c0d
.
</file_signature>
<download_url>http://climateapps2.oucs.ox.ac.uk/cpdnboinc/download/mirror.php?file=/hadcm3n_um_6.07_i686-apple-darwin.zip</download_url>
</file>
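
For anyone who would rather not hunt through client_state.xml by hand, here is a minimal Python sketch for pulling out the matching <file> segments so they can be pasted into a reply. It assumes the file parses as well-formed XML with the standard library, which may not hold for every BOINC client version. The function name file_segments and the sample data are made up for illustration; this is not a CPDN or BOINC tool.

```python
import xml.etree.ElementTree as ET


def file_segments(xml_text, name_fragment):
    """Return each <file> element whose <name> contains name_fragment, as text."""
    root = ET.fromstring(xml_text)
    matches = []
    for elem in root.iter("file"):
        name = (elem.findtext("name") or "").strip()
        if name_fragment in name:
            # Serialise the element back to text so it can be posted as-is.
            matches.append(ET.tostring(elem, encoding="unicode").strip())
    return matches


# Hypothetical, cut-down stand-in for a real client_state.xml.
SAMPLE = """<client_state>
<file>
<name>hadcm3n_um_6.07_i686-apple-darwin.zip</name>
<signature_required/>
</file>
<file>
<name>some_other_file.zip</name>
</file>
</client_state>"""

if __name__ == "__main__":
    for segment in file_segments(SAMPLE, "hadcm3n"):
        print(segment)
```

To use it on a real machine, read the contents of your own client_state.xml into a string and pass that in place of SAMPLE. The signature lines in the output can then be compared with the ones shown above.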

Jonathan Miller
CPDN Sys-Admin
11) Message boards : Number crunching : News and Announcements (Message 46696)
Posted 24 Jul 2013 by Profile old_user651284
Post:
Planned downtime: 2 - 5 August 2013

The server room in which CPDN resides is due to be shut down for electrical testing on 2 August 2013. The testing will take place over the weekend, and CPDN servers will be brought back online on 5 August 2013 (assuming everything goes to plan).

There will be NO CPDN service during this time from any Oxford machines:

ClimatePrediction.net
Climateapps2.oerc.ox.ac.uk [this forum]
cpdn-upload2.oerc.ox.ac.uk
charybdis.oerc.ox.ac.uk
cpdntrickle.oerc.ox.ac.uk
uploader.oerc.ox.ac.uk
uploader1.atm.ox.ac.uk
cpdnbeta.oerc.ox.ac.uk
trillionthtonne.org
seacourt.oerc.ox.ac.uk
glaaki.oerc.ox.ac.uk
kraken.oerc.ox.ac.uk
wyrm.oerc.ox.ac.uk

12) Message boards : Number crunching : UPDATE???? (Message 45345)
Posted 17 Dec 2012 by Profile old_user651284
Post:
Hi,
We are working to bring the update of CPDN into being in March or thereabouts.

This is a very big undertaking, and we are having to get more staff in to help us with the process.
The first stage, which is what we are currently working on, is to establish how our current setup differs from the 'Vanilla' BOINC. It seems CPDN is definitely 'Neapolitan' with some parts being 'Stilton' flavoured ice cream.

As for the lack of updates, yes working for Oxford University is every bit as bad as working for the NHS.

Jovada, you are right, but that does seem to be how our managers here would like us to operate.

There is only so much that those of us at the bottom can do to improve the project when our managers are inert. Morale is pretty low.

Jonathan
CPDN Sys-admin (at least for the moment)
13) Message boards : Number crunching : ODD ERROR (Message 45091)
Posted 15 Oct 2012 by Profile old_user651284
Post:
As far as I can see, this is happening because the server is having difficulty writing to the storage partition. The partition appears to be responding very slowly.

I am talking to our IT chaps about what can be done to overcome this.
14) Message boards : Number crunching : Download error pnw "Can't resolve hostname" (Message 45028)
Posted 5 Oct 2012 by Profile old_user651284
Post:
Hi,

That is the result of a typo by me when I swapped the roles of two servers.
The URL should have been "http://cpdn-downloads.oerc.ox.ac.uk....."

While investigating I also found another problem and fixed it (I hope), so this should work now.

Thanks for the heads up.

Jonathan
15) Message boards : Number crunching : Uploads not working (Message 44906)
Posted 26 Sep 2012 by Profile old_user651284
Post:
Hi,

We have issues on all three of our storage servers at the moment.

Currently Uploader1.atm is full, and the two machines who would normally receive her excess files are suffering from disk issues.

cpdn-upload2.oerc is one of the machines above, so she cannot currently receive uploads.

We are waiting on a fix - I suspect it is to do with the network outage that OeRC suffered yesterday afternoon (2 - 4 pm BST, 25 Sept 2012).

16) Questions and Answers : Macintosh : New to BOINC trouble getting work downloaded (Message 44709)
Posted 15 Aug 2012 by Profile old_user651284
Post:
Hi,
We have one server down at present: Uploader1.atm.ox.ac.uk.

We are trying to move data off her to other servers, but it is a bit like playing musical chairs - everything is OK until the music stops.

She will be back up and running in a few days - we have to move several TB of data to a new home.

Jonathan
17) Message boards : Number crunching : News and Announcements (Message 44494)
Posted 2 Jul 2012 by Profile old_user651284
Post:
Hi,

Outage 30 June - 2 July.

The above loss of service was caused by a network problem with our database server.

The issue has been resolved.

Apologies for the inconvenience this has caused.

Jonathan
18) Message boards : Number crunching : Upload Failure (Message 44465)
Posted 26 Jun 2012 by Profile old_user651284
Post:
We are no longer allowed to use the DNS names containing .OUCS.
We have moved over to using .OERC.

We were graciously allowed to use the domain name for 9 months during the transition period. My requests for a transition period of a year were vetoed by managers in my department, for their own inexplicable reasons.

If you are trying to upload to climateapps1.oucs.ox.ac.uk then you now need

cpdn-restarts.oerc.ox.ac.uk

129.67.195.121

If you could convince your machine (perhaps through /etc/hosts on a Linux machine) to use this address instead, then you can upload it.
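
As a rough illustration of the /etc/hosts approach, the Python sketch below builds the line that would map the old name to the address given above. The helper names hosts_entry and add_entry are hypothetical, and the code only manipulates text (changing the real /etc/hosts needs root access). Whether the server will accept an upload addressed to the old name is an assumption, not something I can promise.

```python
def hosts_entry(ip, *hostnames):
    """Format one hosts-file line: an IP address followed by its names."""
    return ip + "\t" + " ".join(hostnames)


def add_entry(hosts_text, ip, hostname):
    """Return hosts_text with a mapping appended, unless hostname is already listed."""
    for line in hosts_text.splitlines():
        # Ignore comments, then look for the name among the fields after the IP.
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            return hosts_text
    separator = "" if (not hosts_text or hosts_text.endswith("\n")) else "\n"
    return hosts_text + separator + hosts_entry(ip, hostname) + "\n"


if __name__ == "__main__":
    current = "127.0.0.1\tlocalhost\n"  # stand-in for the existing file contents
    print(add_entry(current, "129.67.195.121", "climateapps1.oucs.ox.ac.uk"), end="")
```

The printed text is what the hosts file would need to contain. Remove the extra line again once the stuck upload has gone through, so that the old name does not stay pinned to this address.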

Otherwise, I am afraid you will have to trash it.

Jonathan
19) Message boards : Number crunching : Credit not synchronized on stats-pages (Message 44439)
Posted 20 Jun 2012 by Profile old_user651284
Post:
Hi Chaps,

I am the sysadmin for CPDN. I am not a BOINC expert, and I do not run BOINC on any of my systems.

CPDN uses a custom credit system that calculates credit based upon your total participation in the project, rather than simply appending your recent activity to a running total.

This means that if I try to fix the credit system and get it wrong, I can drastically affect the reported credit of every user who has ever used the project.
Indeed, my first task when I joined the project was to run an existing credit rationalisation script that brought the project down. That had the result of losing credit for some users. I spent a long time determining who had lost data, and stored the information for when I get the chance to fix the problem.
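
To make the difference concrete, here is a minimal Python sketch of the two approaches. It is not the CPDN credit script; the (user, credit) record layout and both function names are assumptions made purely for illustration.

```python
from collections import defaultdict


def recompute_totals(history):
    """Rebuild every user's credit from the complete history of (user, credit) records."""
    totals = defaultdict(float)
    for user, credit in history:
        totals[user] += credit
    return dict(totals)


def append_recent(totals, recent):
    """Running-total alternative: add only the new records to the existing totals."""
    updated = dict(totals)
    for user, credit in recent:
        updated[user] = updated.get(user, 0.0) + credit
    return updated


if __name__ == "__main__":
    history = [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)]
    print(recompute_totals(history))
    print(append_recent(recompute_totals(history[:2]), history[2:]))
```

Both routes give the same totals when the code is right. The difference is in what a mistake costs: a bug in the first function touches every user in the history, while a bug in the second only touches users with recent records.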

In investigating this problem, it would help me greatly if you could take me through the issues as you see them.

Please appreciate though, that we have 10 years of user credit data in our database (not a very good design, I agree, but we have to work from where we find ourselves).

I cannot 'just run it through Excel' because we have billions of records, and the exported user data is many 10s of GB in size. Excel is not up to the job.

As Iain said, I need your help to identify what is going wrong. I need you to point me to clear problems, and I will then see if I can understand what is going wrong.

Bear in mind that I am not a BOINC user, so I need this spelt out in words of one syllable (I am a bear of very little brain, who has a piece of fluff in his ear).



Jonathan
20) Message boards : Number crunching : Upload Failure (Message 44411)
Posted 15 Jun 2012 by Profile old_user651284
Post:
EDIT: This issue has now been resolved, and problems encountered as of 10.00 am BST on 18 June 2012 are likely to be due to the servers struggling to catch up with a weekend of uploads.

Sorry everyone, but the data centre in which we host our servers has had a mystery network problem since Thursday 14 June 15:45 GMT.

The problem manifests as intermittent loss of network within the data centre, which means that the project servers cannot communicate with one another.

The error below is because the upload server you are trying to use cannot see the network file storage that it wants to write your result to.

I have taken cpdn-upload2.oerc and cpdn-restarts.oerc offline until this is resolved.

I am dependent upon other people to look at this issue, and it is now gone 4 pm on the last day of University term, so I don't hold out any hope at all of a fix until Monday.

Sorry.

Jonathan

