Posts by MikeMarsUK

21) Message boards : Number crunching : FIX FOR BAD TIME REMAINING ESTIMATES? (Message 47642)
Posted 25 Nov 2013 by Profile MikeMarsUK
Post:
I did abort it. I suspect it could be a result of some sort of corruption on my computer: ...


Usually it happens when the task is removed from memory at the wrong moment (there are a few sensitive points in the model's processing where it cannot cope with being saved & reloaded).

22) Message boards : Number crunching : Persistent upload problems (Message 47626)
Posted 21 Nov 2013 by Profile MikeMarsUK
Post:
Maybe it's Rutherford Appleton Laboratory that is blocking the pings, rather than JANET itself.
23) Message boards : Number crunching : Persistent upload problems (Message 47623)
Posted 21 Nov 2013 by Profile MikeMarsUK
Post:
JANET6, the Joint Academic NETwork ('ja.net' in the log above) is the private 2Tb/s academic network here in the UK. It blocks pings, so that is why your tracert stops after that point. So actually your log doesn't look too bad.

http://en.wikipedia.org/wiki/JANET


From here (work, not home), mine looks similar. While I'm in the UK, our corporate network proxies out to Florida hence the 5000 mile detour.


Tracing route to rapid-watch.badc.rl.ac.uk [130.246.191.84]
over a maximum of 30 hops:
...
9 116 ms 117 ms 118 ms ge-6-23.car2.Orlando1.Level3.net [4.79.118.181]
10 148 ms 117 ms 117 ms ae-2-9.bar2.Orlando1.Level3.net [4.69.133.70]
11 117 ms 117 ms 116 ms ae-0-11.bar1.Orlando1.Level3.net [4.69.137.145]
12 218 ms 217 ms 217 ms ae-8-8.ebr1.Atlanta2.Level3.net [4.69.137.150]
13 221 ms 217 ms 217 ms ae-6-6.ebr1.Washington12.Level3.net [4.69.148.106]
14 223 ms 225 ms 224 ms ae-1-100.ebr2.Washington12.Level3.net [4.69.143.214]
15 218 ms 217 ms 217 ms 4.69.148.49
16 217 ms 217 ms 217 ms ae-41-41.ebr2.London1.Level3.net [4.69.137.65]
17 221 ms 218 ms 219 ms vlan102.ebr1.London1.Level3.net [4.69.143.89]
18 217 ms 217 ms 217 ms ae-4-4.car1.Manchesteruk1.Level3.net [4.69.133.101]
19 217 ms 216 ms 265 ms 195.50.119.98
20 218 ms 274 ms 239 ms ae29.erdiss-sbr1.ja.net [146.97.33.41]
21 223 ms 223 ms 223 ms ae31.londpg-sbr1.ja.net [146.97.33.21]
22 * * * Request timed out.
23 * * * Request timed out.
24 * * * Request timed out.
25 * * * Request timed out.
26 * * * Request timed out.
27 * * * Request timed out.
28 * * * Request timed out.
29 * * * Request timed out.
30 * * * Request timed out.

Trace complete.
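
For anyone comparing their own trace against this one, the useful number is the last hop that actually answered. Below is a minimal sketch that pulls it out of Windows tracert text; the function name and the regular expression are my own and assume the usual "hop, three probe times, host" layout shown above.

```python
import re

# Start of a Windows tracert hop line: hop number, then three probe
# results, each either "<n> ms" or "*", then the host text.
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+\s*ms|\*)\s+){3})(.*)$")


def last_responding_hop(trace_text):
    """Return (hop_number, host_text) for the last hop with a reply, or None."""
    last = None
    for line in trace_text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header lines, blank lines, wrapped fragments
        hop, probes, host = match.groups()
        if "ms" in probes:  # at least one probe came back
            last = (int(hop), host.strip())
    return last


sample = """\
 20 218 ms 274 ms 239 ms ae29.erdiss-sbr1.ja.net [146.97.33.41]
 21 223 ms 223 ms 223 ms ae31.londpg-sbr1.ja.net [146.97.33.21]
 22 * * * Request timed out.
 23 * * * Request timed out.
"""
print(last_responding_hop(sample))
```

If the last responding hop is inside ja.net, as here, the trace got as far as it can be expected to.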
24) Questions and Answers : Windows : not accepting requests from this host (Message 47600)
Posted 18 Nov 2013 by Profile MikeMarsUK
Post:

Yes, if the task gets suspended at the wrong moment it can kill the model. I prefer to change the settings so that the model keeps running regardless of computer activity.

25) Message boards : Number crunching : NEW BOINC VERSION (Message 47595)
Posted 18 Nov 2013 by Profile MikeMarsUK
Post:
... A common complaint is also about the messages tab. The v7 client has an "event log" which opens in a separate window and shows the same stuff the messages tab used to show.


Yes, the missing messages tab is very irritating (extra clicks needed to find it), but apart from that, the v7 release I am on seems OK.

26) Message boards : Number crunching : A different upload problem -- out of disk space on rapid-watch.badc.rl.ac.uk (Message 47560)
Posted 13 Nov 2013 by Profile MikeMarsUK
Post:
I assume it is only to stop me missing crunching CPDN so much but I now have another project telling me it is out of disk space :)


They must be jealous! :-)
27) Message boards : Number crunching : A different upload problem -- out of disk space on rapid-watch.badc.rl.ac.uk (Message 47536)
Posted 11 Nov 2013 by Profile MikeMarsUK
Post:
...11/11/2013 14:35:05 | climateprediction.net | [error] Error reported by file upload server: Server is out of disk space
...
Has anyone any idea when this problem will be resolved?


Nope, this server is hosted by a third-party establishment outside Oxford University. They've been notified, but it could take quite a while.

28) Message boards : Number crunching : Remaining time increased by 50 hours (Message 47475)
Posted 5 Nov 2013 by Profile MikeMarsUK
Post:

The important thing to check is that it is making progress (percentage increases) rather than stuck in a loop.

If it is just the time estimate, that may have happened when your last model uploaded on the 3rd.
29) Questions and Answers : Wish list : Using GPUs for number crunching (Message 47388)
Posted 22 Oct 2013 by Profile MikeMarsUK
Post:
...Might be worthwhile for CPDN to "lose" 3 to 6 months "recoding" the project for GPU's ....


Like Dave says, it would probably take man-decades rather than a few months. The simple models we run are about a million lines of code, and the next generation is 10 million lines of code. If they start today, perhaps they might finish on the 19th Jan 2038, but probably they won't :-)

In any case, GPUs can perform only relatively simple tasks. Complex ad-hoc code is out of their scope.
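
The 19th Jan 2038 date is presumably a nod to the point where a signed 32-bit count of seconds since 1970 overflows. A quick check of that arithmetic (nothing here is CPDN-specific):

```python
from datetime import datetime, timedelta, timezone

# Largest value a signed 32-bit integer can hold.
MAX_INT32 = 2**31 - 1

# Unix time counts seconds from midnight UTC on 1 January 1970.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
rollover = epoch + timedelta(seconds=MAX_INT32)

print(rollover.isoformat())  # 2038-01-19T03:14:07+00:00
```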
30) Message boards : Number crunching : Persistent upload problems (Message 47375)
Posted 21 Oct 2013 by Profile MikeMarsUK
Post:
... but Windows machines were using TCP/IP very inefficiently over that class of link. That's acknowledged in the Microsoft Technet article on Tcp1323Opts
...


Hmmm very interesting. I wonder if that explains why my 24Mb/s link at home is so painfully slow at peak times. I will have to experiment this evening to see if it helps with local congestion.
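
Some rough arithmetic on why the window size matters: without the RFC 1323 window scaling option, a TCP connection can have at most 65,535 bytes in flight per round trip, so throughput is capped at window / round-trip time. The sketch below uses an assumed 200 ms round trip purely for illustration; it is not a measurement of my link.

```python
UNSCALED_MAX_WINDOW = 65535  # bytes; largest TCP window without scaling


def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on one connection's throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps, rtt_seconds):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_bps * rtt_seconds / 8


rtt = 0.2  # assumed round-trip time in seconds (illustrative only)
print(max_throughput_bps(UNSCALED_MAX_WINDOW, rtt))  # bits per second
print(window_needed_bytes(24_000_000, rtt))          # bytes
```

With those assumed numbers an unscaled connection tops out at roughly 2.6 Mb/s, while filling a 24Mb/s link would need about 600,000 bytes in flight. On a short round trip the cap is much higher, so this only bites on long or congested paths.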

31) Questions and Answers : Unix/Linux : hnddler vs handler (Message 47312)
Posted 13 Oct 2013 by Profile MikeMarsUK
Post:
Eirik Redd, does this happen when running other projects?

Ed: I know WCG also uses the "file_upload_handler" syntax.


He has gone over to Boinc v7.0.xx which fixes the problem, so he won't be seeing it any more. According to Alex's linked thread, it was a Boinc issue where the string was corrupted if there were trailing / leading spaces.
32) Message boards : climateprediction.net Science : Climate models may be wrong (Message 47311)
Posted 13 Oct 2013 by Profile MikeMarsUK
Post:
... and that especially applies to developing nations like China and India.


I would disagree there. Per-capita, emissions are far higher in the developed countries, and as a result, first world countries are the biggest culprits. If you want to start reducing population, you should be looking there first. I am curious to know why you are looking at the third world first.

... Well, I have a nice long response here, you should be busy micro-quoting.


Happy?
33) Message boards : Number crunching : Reporting - Errors while computing - (Message 47310)
Posted 13 Oct 2013 by Profile MikeMarsUK
Post:
I agree with the above two. Just to expand on Dave's point, after you have finished looking at your antivirus's options, if you are looking on the website, the boinc settings are found in Account / computing options.

* Suspend work while computer is in use? no

* Suspend work if CPU usage is above 0 %

* Leave tasks in memory while suspended? yes
Suspended tasks will consume swap space if 'yes'



Having these three settings means that the task will stay in memory rather than being pushed out & reloaded repeatedly. You have plenty of memory, so it should be fine to keep them in memory.
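
If you would rather set these locally than on the website, the same three choices can go in a local preferences override file. The sketch below only builds the XML text. The tag names (run_if_user_active, suspend_cpu_usage, leave_apps_in_memory) and the global_preferences root are as I remember them from the BOINC documentation for global_prefs_override.xml, so treat them as assumptions and check before relying on them.

```python
import xml.etree.ElementTree as ET


def build_override(settings):
    """Build a global_preferences XML fragment from a {tag: value} mapping."""
    root = ET.Element("global_preferences")
    for tag, value in settings.items():
        ET.SubElement(root, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Assumed tag names; values mirror the three settings listed above.
settings = {
    "run_if_user_active": 1,    # i.e. do NOT suspend while computer is in use
    "suspend_cpu_usage": 0,     # 0 means never suspend on CPU usage
    "leave_apps_in_memory": 1,  # keep suspended tasks in memory
}
print(build_override(settings))
```

Note that the first setting is inverted compared with the website wording: "Suspend work while computer is in use? no" corresponds to running while the user is active.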
34) Message boards : Number crunching : failed upload: can't resolve hostname (Message 47255)
Posted 8 Oct 2013 by Profile MikeMarsUK
Post:


Sounds like one of the problems I am having, raised in this message thread.



I don't think it is the same thing. The uploads in this thread were failing because the name of the upload server was spelt wrong in the configuration file (apid-wattch), whereas your log file shows that the server name is spelt correctly (rapid-watch).

I can't see any obvious reason in your log files for it going wrong. What sort of firewall do you use? It may be worth taking a look at the firewall and antivirus logs to see if anything is appearing there (some security software blocks big zip files, for example, they may appear as a 'compression bomb' in the log).
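
A quick way to tell the two cases apart is to check whether the hostname in the configuration resolves at all. This is a minimal sketch: can_resolve is my own helper name, and the demo uses a stand-in resolver (a hypothetical table that only knows the correctly spelt name) so that it runs without touching the network. The misspelt full hostname is illustrative, built from the fragment quoted above.

```python
import socket


def can_resolve(hostname, resolver=socket.gethostbyname):
    """Return True if the resolver finds an address for hostname."""
    try:
        resolver(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


# Stand-in resolver for the demo: knows only the correctly spelt server.
KNOWN = {"rapid-watch.badc.rl.ac.uk": "130.246.191.84"}


def demo_resolver(hostname):
    if hostname not in KNOWN:
        raise socket.gaierror("name not known")
    return KNOWN[hostname]


print(can_resolve("rapid-watch.badc.rl.ac.uk", demo_resolver))  # True
print(can_resolve("apid-wattch.badc.rl.ac.uk", demo_resolver))  # False

# Calling can_resolve(name) without the second argument does a real DNS lookup.
```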
35) Message boards : Number crunching : FIX FOR BAD TIME REMAINING ESTIMATES? (Message 47254)
Posted 8 Oct 2013 by Profile MikeMarsUK
Post:
Perhaps info in this thread may help


I am currently getting a page not found error on that link.


That was a page on the phpBB forum, which unfortunately was closed down a few months ago.
36) Message boards : Number crunching : failed upload: can't resolve hostname (Message 47245)
Posted 7 Oct 2013 by Profile MikeMarsUK
Post:
a) I did already ask for that (although there is no sign that it has happened),
and b) I am not sure yet whether this remapping works or not, note the following post from earlier in the thread:


Mhhh, looks like there is another problem now. It is certainly uploading the data, but I get another error message:

12-Sep-2013 14:44:56 [climateprediction.net] Temporarily failed upload of hadcm3n_o5ss_1980_40_008385337_3_1.zip: transient upload error
12-Sep-2013 14:44:56 [climateprediction.net] Backing off 2 hr 1 min 41 sec on upload of hadcm3n_o5ss_1980_40_008385337_3_1.zip


I was hoping that someone would come back & confirm whether it works (or doesn't work) before I re-raise this with the admins.
37) Message boards : Number crunching : Compute Errors / Bad Work Units? (Message 47215)
Posted 30 Sep 2013 by Profile MikeMarsUK
Post:


I personally would use the following settings:

* Suspend work if CPU usage above %
0 (i.e., do not suspend)


* Leave tasks in memory while suspended?
Yes


* Suspend work while computer is in use?
No



As WB8ILI says, when you are shutting down your PC, first suspend Boinc, wait a few moments, then shut down Boinc. This gives the models a chance to shut down cleanly rather than being killed by Windows during the shutdown process. Similarly, if you are about to do something intensive on the PC (for example, gaming), it is a good idea to shut down Boinc first as well.
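
The suspend, wait, then exit sequence can be scripted with the boinccmd command-line tool. A rough sketch follows, assuming boinccmd is on the PATH and is allowed to talk to the running client; the 30 second pause is my own guess at "a few moments", and the function names are made up for this example.

```python
import subprocess
import time


def shutdown_commands():
    """The two boinccmd calls, in order: suspend all work, then exit the client."""
    return [
        ["boinccmd", "--set_run_mode", "never"],
        ["boinccmd", "--quit"],
    ]


def shut_down_boinc(wait_seconds=30, run=subprocess.run, sleep=time.sleep):
    """Suspend computation, pause so models can checkpoint, then quit BOINC."""
    suspend, quit_cmd = shutdown_commands()
    run(suspend, check=True)   # stop crunching but leave the client up
    sleep(wait_seconds)        # assumed pause; adjust to taste
    run(quit_cmd, check=True)  # ask the client to exit cleanly
```

Call shut_down_boinc() before shutting down Windows or starting a game. The run and sleep parameters exist only so the sequence can be tried out without a real client.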
38) Message boards : Number crunching : Trickles not appearing on task list (Message 47156)
Posted 23 Sep 2013 by Profile MikeMarsUK
Post:
Most of that stuff only appears when the task is completed. Granted credit is the only one which will appear before the end (it is updated nightly, roughly GMT midnight, based on the trickles seen so far).

The server was down this weekend, so the last time that the trickle->credit processing ran was Friday night. It should be run tonight as normal, so tomorrow the task will show granted credit for the trickles so far.
39) Questions and Answers : Windows : Optimise PC build for CPDN (Message 47138)
Posted 21 Sep 2013 by Profile MikeMarsUK
Post:

I do have a UPS also - ... I'm not currently running it because the power supply has been much improved and I no longer get powercuts.


Having made the mistake of saying that I don't get power-cuts any more, yesterday the power went and I lost 5 of my 6 models...

If this happens again, I might need to invest in a new set of batteries.
40) Message boards : Number crunching : still don't get credits since last breakdown (Message 47108)
Posted 18 Sep 2013 by Profile MikeMarsUK
Post:
Yes, the beta credits were doubled for a while. I think it is OK now.


