Posts by bernard_ivo

1) Questions and Answers : Unix/Linux : computation error at 100% complete (Message 62856)
Posted 26 days ago by bernard_ivo
Post:
I changed the values following the suggested process and restarted the client. So far all 4 WUs are running OK and 1 zip has uploaded. I have one more task on another machine, but since I have a relatively fast internet connection I will take the risk with that one.
2) Questions and Answers : Unix/Linux : computation error at 100% complete (Message 62848)
Posted 27 days ago by bernard_ivo
Post:
Ok,
I have 4 WUs that have progressed to between 20 and 75%, so some zips have already uploaded. Should I change all instances of
<max_nbytes>150000000.000000</max_nbytes> for each WU, or only those for the remaining zips?

There are also other files that have <max_nbytes>0.000000</max_nbytes>, so I guess I should not alter these?
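A minimal sketch of the kind of edit being asked about, written as a text transformation so it can be tried on a copy first. It assumes that every non-zero `<max_nbytes>` entry is raised and that `0.000000` entries are left alone, which follows the guess above rather than any confirmed advice in this thread. The new limit `NEW_LIMIT` and the function name are hypothetical, and the thread does not name the file the process applies to.

```python
import re

# Hypothetical new limit: the thread does not state the value that was recommended.
NEW_LIMIT = 300_000_000.0

# Matches tags such as <max_nbytes>150000000.000000</max_nbytes>
_MAX_NBYTES = re.compile(r"<max_nbytes>(\d+(?:\.\d+)?)</max_nbytes>")


def raise_upload_limits(xml_text: str, new_limit: float = NEW_LIMIT) -> str:
    """Raise each non-zero <max_nbytes> below new_limit; leave all others as-is."""

    def _swap(match: re.Match) -> str:
        current = float(match.group(1))
        # Assumption: a value of 0 carries a different meaning and stays untouched.
        if current == 0.0 or current >= new_limit:
            return match.group(0)
        return f"<max_nbytes>{new_limit:.6f}</max_nbytes>"

    return _MAX_NBYTES.sub(_swap, xml_text)


if __name__ == "__main__":
    sample = (
        "<max_nbytes>150000000.000000</max_nbytes>\n"
        "<max_nbytes>0.000000</max_nbytes>\n"
    )
    print(raise_upload_limits(sample))
```

Running it on a backup copy of the text and comparing the result with the original is a cheap way to confirm that only the intended entries changed.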
3) Message boards : Number crunching : New work Discussion (Message 62846)
Posted 27 days ago by bernard_ivo
Post:
#877 and #878 have had all tasks waiting to go out withdrawn as many produce uploads greater than the limit allowed causing some to fail at 100% completion. They will be re-issued shortly.

Edit: Those already sent out will be left to run. Credit will be granted.


I have a few of these. Should I let them finish or should I abort them?
4) Message boards : Number crunching : New work Discussion (Message 62792)
Posted 25 Oct 2020 by bernard_ivo
Post:
It would be great if someone could post system requirements for OpenIFS once more WUs have been tested. It seems my i7-4790 with 16 GB, 21 GB of /var space and 5.6 GB of swap may not be able to handle more than two WUs at once.
5) Message boards : Number crunching : Welcome back/checking if everything is working? (Message 62773)
Posted 9 Oct 2020 by bernard_ivo
Post:
There's quite a few batches of those, with only a small number left in each one.
I had a look at a few; some are "stuck", but some have just started running, and are returning trickles.

I'll see what the project thinks about wiping everything.


It would be great if some cleanup happened. I have had one orphaned Full Resolution Ocean task in my "In progress" web tab since 2014, and it is set to expire in 2023. I'm almost there.
6) Message boards : Number crunching : UK Met Office HadAM4 at N216 resolution (Message 62772)
Posted 9 Oct 2020 by bernard_ivo
Post:
I also got 5 of the new ones. With 6 N216 tasks my /var usage climbed to ~16 GB. With 4 WCG ARP tasks also in the queue, I almost ran out of space on /var (~20 GB) and BOINC Manager crashed. I needed to clean out some journals. Luckily no CPDN models crashed because of the low disk space. Reducing work to the number of real cores and clearing the ARPs will get things back to normal.
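For anyone wanting to catch this before the manager falls over, here is a small sketch of a free-space check on the partition holding the BOINC data. The 2 GiB reserve and the function names are hypothetical choices for illustration, not project recommendations; `/var` is simply the partition mentioned above.

```python
import shutil

GIB = 1024 ** 3


def free_gib(path: str) -> float:
    """Return the free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / GIB


def is_low(free: float, reserve: float) -> bool:
    """True when the free space (GiB) has dropped below the chosen reserve (GiB)."""
    return free < reserve


if __name__ == "__main__":
    path = "/var"      # partition named in the post; adjust to your own layout
    reserve = 2.0      # hypothetical safety margin in GiB
    free = free_gib(path)
    state = "LOW" if is_low(free, reserve) else "ok"
    print(f"{path}: {free:.1f} GiB free ({state})")
```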
7) Message boards : Number crunching : UK Met Office HadAM4 at N216 resolution (Message 62760)
Posted 7 Oct 2020 by bernard_ivo
Post:
And I have just picked up one from #843 as well. (On its fifth and final attempt.)

I also got another one but from #842. On its second attempt after a whole year with no response. I still think deadlines should be shortened.
8) Message boards : Number crunching : UK Met Office HadAM4 at N216 resolution (Message 62743)
Posted 3 Oct 2020 by bernard_ivo
Post:
There's been quite a few fails, and several hundred still running, (possibly not for the first time), so if you put your foot down and go for it, you're in with a chance. :)

Good then, I will let it run. Thanks.
9) Message boards : Number crunching : UK Met Office HadAM4 at N216 resolution (Message 62741)
Posted 1 Oct 2020 by bernard_ivo
Post:
Yay, I got one from batch 843. The task had timed out after one year with no response. I wonder if it is of any use except for upping my points.
10) Message boards : Number crunching : Updated BOINC Clients 7.16.11 - Windows 64-bit and Mac OS X (64-bit Intel) (Message 62723)
Posted 16 Sep 2020 by bernard_ivo
Post:
While we wait for WUs, couldn't some improvements be made?
Shorter deadlines, being able to select tasks, a check for 32-bit libs on Linux, cleaning up ghost WUs, better communication...
11) Message boards : Number crunching : Big models (Message 62657)
Posted 11 Aug 2020 by bernard_ivo
Post:
Well, it looks like no one is against bigger uploads, so the researchers can go ahead with the current model.


What would the checkpoint interval be? I can't recall exactly, but on my i7-4790 the checkpoint interval was 40-60 minutes. Is there any chance of reducing it a bit?
12) Message boards : climateprediction.net Science : Misconfigured Machine? (Message 62429)
Posted 18 May 2020 by bernard_ivo
Post:
This is interesting. The 1107 errors are not surprising, but how did he get a valid UK Met Office HadCM3 short v8.34 i686-pc-linux-gnu?
Don't they require the 32-bit libraries too?
https://www.cpdn.org/results.php?hostid=1472944

This one is still crashing around 12 WUs a day, 99.999% (1494 in total)

https://www.cpdn.org/cpdnboinc/results.php?hostid=1499785 New machine 24 WUs in the last two days
https://www.cpdn.org/cpdnboinc/results.php?hostid=1504413 New machine 25 WUs in the last two days
https://www.cpdn.org/cpdnboinc/results.php?hostid=1473091 2018 machine 100% WUs crashed (43 in total)
13) Message boards : Number crunching : Download errors on UK Met Office HadAM4 at N216 resolution v8.52 tasks (Message 62428)
Posted 15 May 2020 by bernard_ivo
Post:
I have a stuck download for a batch 867 WU:

Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] HTTP_OP::init_get(): http://download.cpdn.org/download//batch_867/workunits/hadam4h_a17c_209511_4_867_012013459.zip
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | Started download of hadam4h_a17c_209511_4_867_012013459.zip
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] HTTP_OP::init_get(): http://download.cpdn.org/download//batch_867/ancils/a17c_867_atmos.gz
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | Started download of a17c_867_atmos.gz
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] [ID#71828] Info: Connection 3859 seems to be dead!
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] [ID#71828] Info: Closing connection 3859
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] [ID#71828] Info: TLSv1.2 (OUT), TLS alert, Client hello (1):
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] [ID#71829] Info: Found bundle for host download.cpdn.org: 0x559f10a67390 [serially]
Fri 15 May 2020 12:38:33 PM EEST | climateprediction.net | [http] [ID#71828] Info: Trying 129.67.193.131...
..........................
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Temporarily failed download of ic_N216_2002_12_000004.nc.gz: transient HTTP error
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Backing off 00:04:42 on download of ic_N216_2002_12_000004.nc.gz
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Temporarily failed download of HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz: transient HTTP error
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Backing off 00:04:19 on download of HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz
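When a log like the one above gets long, it can help to pull out just the files that are in back-off and how long each will wait. The sketch below assumes the `timestamp | project | message` layout and the "Backing off HH:MM:SS on download of NAME" wording seen in these lines; the function name is hypothetical and other client versions may word the messages differently.

```python
import re
from datetime import timedelta

# Message wording taken from the log lines quoted in this post.
_BACKOFF = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on download of (\S+)")


def parse_backoffs(log_text: str) -> dict[str, timedelta]:
    """Map each backed-off file name to its most recent back-off duration."""
    backoffs: dict[str, timedelta] = {}
    for line in log_text.splitlines():
        # Expected layout: "<timestamp> | <project> | <message>"
        parts = line.split(" | ", 2)
        if len(parts) != 3:
            continue  # skips elided or malformed lines
        match = _BACKOFF.search(parts[2])
        if match:
            hours, minutes, seconds = (int(v) for v in match.group(1, 2, 3))
            backoffs[match.group(4)] = timedelta(
                hours=hours, minutes=minutes, seconds=seconds
            )
    return backoffs


if __name__ == "__main__":
    sample = (
        "Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | "
        "Backing off 00:04:42 on download of ic_N216_2002_12_000004.nc.gz"
    )
    for name, wait in parse_backoffs(sample).items():
        print(f"{name}: retry in {wait}")
```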
14) Message boards : climateprediction.net Science : Misconfigured Machine? (Message 62406)
Posted 7 May 2020 by bernard_ivo
Post:
Can someone make this one a sticky? Is there a way to block all non-compliant PCs? It takes a lot of resources to report misconfigured machines.


https://www.cpdn.org/results.php?hostid=1498245
https://www.cpdn.org/results.php?hostid=1493995 this was reported in November but hasn't been blocked
https://www.cpdn.org/results.php?hostid=1503343 a new fella created 28 April
https://www.cpdn.org/results.php?hostid=1472002

https://www.cpdn.org/cpdnboinc/show_host_detail.php?hostid=1496283
https://www.cpdn.org/cpdnboinc/show_host_detail.php?hostid=1479116
https://www.cpdn.org/cpdnboinc/show_host_detail.php?hostid=1484003
15) Message boards : Number crunching : New Model Type HadAM4 (Message 62397)
Posted 4 May 2020 by bernard_ivo
Post:
I also have a few WUs with this error; however, they all finished successfully:
https://www.cpdn.org/cpdnboinc/result.php?resultid=21927138
https://www.cpdn.org/cpdnboinc/result.php?resultid=21920061
This computer is heavily used for other tasks, hence only 2 WUs, but I suspect it just can't handle the full load (i7-3520M, 8 GB RAM).
16) Message boards : Number crunching : Server status page shows different numbers for tasks in progress (Message 62336)
Posted 22 Apr 2020 by bernard_ivo
Post:
That would be great. Would it be possible to remove orphaned WUs also?
These are WUs still "In progress" but no longer at the user's system (even after detach/attach)
17) Message boards : Number crunching : No trickles on webpage (Message 62275)
Posted 1 Apr 2020 by bernard_ivo
Post:
Trickles seem to have appeared today. Thanks.
18) Message boards : Number crunching : No trickles on webpage (Message 62270)
Posted 30 Mar 2020 by bernard_ivo
Post:
Hi,
It seems there might be a problem with trickles after 21-22 March.
I have at least 3 N216 tasks whose 3rd and 4th trickles are missing from the web page, even though the tasks finished successfully and the upload queues are empty.

Here is an example: https://www.cpdn.org/cpdnboinc/result.php?resultid=21871312
19) Message boards : Number crunching : New work Discussion (Message 62163)
Posted 27 Feb 2020 by bernard_ivo
Post:
I still believe one way to go is to shorten WU deadlines. The number of Windows tasks completed per 24 hours is small compared with the number of tasks in progress. Linux boxes, though currently fewer, return a higher percentage of tasks than Windows boxes relative to tasks in progress. This might suggest that even if a user is not hoarding, tasks may still sit idle because other projects have priority.

Edit: And yes, there are whole model categories, on both Linux and Windows, that haven't returned completed tasks recently despite having tasks in progress. (Sure, there are ghost WUs as well.)
20) Message boards : climateprediction.net Science : Climate change in the News (Message 62131)
Posted 18 Feb 2020 by bernard_ivo
Post:
Isambard 2 at UK Met Office to be largest Arm supercomputer in Europe

The UK Met Office has been awarded £4.1m by EPSRC to build Isambard 2, the largest Arm-based supercomputer in Europe. The powerful new £6.5m facility, to be hosted by the Met Office in Exeter and utilized by the universities of Bath, Bristol, Cardiff and Exeter, will double the size of GW4 Isambard, to 21,504 high performance cores and 336 nodes...

Isambard 2 will continue to support their efforts in developing future systems for weather forecasting and climate predictions...

Details here
https://insidehpc.com/2020/02/isambard-2-at-uk-met-office-to-be-largest-arm-supercomputer-in-europe/

Shall we see CPDN for ARM?



©2020 climateprediction.net