Posts by wateroakley


21) Message boards : Number crunching : The uploads are stuck (Message 68006)
Posted 23 Jan 2023 by wateroakley
Post:
Thank you for the update Glenn. I have suspended network activity until cpdn storage capacity becomes available.
22) Message boards : Number crunching : The uploads are stuck (Message 67984)
Posted 23 Jan 2023 by wateroakley
Post:
The ubuntu VM here has nine tasks waiting to upload their 122 transfers and a task queue of about 12 days. At the present task-rate, the ubuntu VM disc is estimated to fill up in four to five days.
23) Message boards : Number crunching : w/u failed at the 89th zip file (Message 67934)
Posted 21 Jan 2023 by wateroakley
Post:
I've had a couple of these errors on an ubuntu VM. The advice from Glenn was to allow 5GB per IFS task and run a maximum number of openIFS tasks equal to n-1 physical cpu cores. To avoid any unexpected compatibility issues I also dropped running any other projects.

With a 4-core i7 cpu and 24GB memory that would be 3 concurrent openIFS tasks, leaving memory headroom and one physical cpu core for whatever else the computer system wants to run.
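
As a rough sketch of that sizing rule (the function name is made up for illustration, and the 5GB default is simply the figure quoted above, not something BOINC or CPDN provides):

```python
def max_openifs_tasks(physical_cores: int, ram_gb: float, gb_per_task: float = 5.0) -> int:
    """Rule of thumb from the post: at most n-1 physical cores, ~5GB RAM per task."""
    by_cores = max(physical_cores - 1, 0)    # keep one physical core free
    by_memory = int(ram_gb // gb_per_task)   # whole tasks that fit in the given RAM
    return min(by_cores, by_memory)


# 4-core i7 with 24GB: cores allow 3, memory allows 4, so the limit is 3.
print(max_openifs_tasks(4, 24))
```

Whichever of the two limits is smaller wins, so on a low-memory machine the RAM figure, not the core count, decides how many tasks to allow.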

This w/u https://www.cpdn.org/result.php?resultid=22269116 failed with a most informative error

"double free or corruption (out)"

Anybody had one of these? Just curious what it might mean??
Ta
Nairb
24) Message boards : Number crunching : The uploads are stuck (Message 67678)
Posted 13 Jan 2023 by wateroakley
Post:
23:15 GMT 13 Jan 2023. Here, the backlog of uploads has just cleared a few minutes ago. There is about six days of openIFS work cached with deadlines from Sat 21 Jan 2023 to Mon 23 Jan 2023. They are followed by the next openIFS task deadlines from 10 to 12 Feb 2023. Provided everything is left running 24/7, it all looks doable here, within the deadlines.
Ok. do you have a rough idea of how many extra days you need? I've already raised this with CPDN following Richard's message about the grace period. I'm in a meeting with them on Monday so I can talk about it then if you give me a rough estimate?
25) Message boards : Number crunching : OpenIFS tasks : make sure boinc client option 'Leave non-GPU tasks in memory' is selected! (Message 67640)
Posted 13 Jan 2023 by wateroakley
Post:
ktf
I’d concur with Andrey. 8GB ram is less than the combined minimum recommended for Ubuntu (4GB) plus a current openIFS task (about 5GB). Without enough ram, your three or four cpdn tasks will get continually sent to disc whenever a swap occurs - that’s a recipe for crashes. One openIFS task may be ok. From experience of Ubuntu and cpdn in a VM, it’s not going to be happy with anything less than 10 or 11 GB ram. Your celeron cpu should support 16GB ram, subject to the mobo and chipset limitations. That should let you run one or two of the current openIFS tasks. You can set the BOINC options to limit the cpu count to nn% and thereby limit it to one or two tasks. However, the upcoming ram requirements, which Glenn has indicated are much higher for future models, are going to exceed the maximum possible ram of your celeron cpu. H.
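
A minimal sketch of picking the "nn%" figure for a wanted number of tasks. Two assumptions are baked in and should be checked against your own client: that each task occupies one CPU, and that the BOINC client rounds the resulting CPU count down (never below one). The function names are illustrative only.

```python
def cpu_percent_for_tasks(logical_cpus: int, target_tasks: int) -> int:
    """Smallest whole percentage that still allows target_tasks CPUs."""
    if not 1 <= target_tasks <= logical_cpus:
        raise ValueError("target_tasks must be between 1 and logical_cpus")
    # Integer ceiling of 100 * target_tasks / logical_cpus.
    return (100 * target_tasks + logical_cpus - 1) // logical_cpus


def tasks_allowed(logical_cpus: int, percent: int) -> int:
    """ASSUMPTION: the client rounds the CPU count down, with a floor of one."""
    return max(1, logical_cpus * percent // 100)


pct = cpu_percent_for_tasks(4, 2)
print(pct, tasks_allowed(4, pct))
```

Rounding the percentage up rather than down avoids accidentally landing one CPU short when the division is not exact.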
26) Message boards : Number crunching : The uploads are stuck (Message 67622)
Posted 12 Jan 2023 by wateroakley
Post:
I'm OK with the current mix of task run time, file size, file numbers and our upload broadband speed; even with the outage, it's manageable for me. If needed, I can increase the VM disc for ubuntu to over 900GB.

If it really troubles people to see so many uploads building up in number we can modify it for these longer model runs (3 month forecasts).
27) Message boards : Number crunching : The uploads are stuck (Message 67607)
Posted 12 Jan 2023 by wateroakley
Post:
Thank you for the update Glenn.
Update on the upload server 11:15GMT

Had email from CPDN that they are moving data off the upload server; it will be some time before they can enable httpd again. Wasn't given a time estimate but they have to move 25TB and last downtime it took them the best part of a day to move the data from the broken upload server.
28) Message boards : Number crunching : OpenIFS Discussion (Message 67510)
Posted 10 Jan 2023 by wateroakley
Post:
Thank you Glenn for the regular communication and updates.

As an IT programme director in a previous life 'communicate, communicate, communicate' was the most important requirement for an effective programme team.
29) Message boards : Number crunching : Why does this task fail ? (Message 67508)
Posted 10 Jan 2023 by wateroakley
Post:
So ...
1) Why has the task crashed? How can I find any information on this?
2) How can I recognize, that a task is crashed, if BOINC doesn't tell anything about a crash?
The openIFS thread discusses the 'file absent' problem. If you take a look at your /var/log/syslog file for the entries around the time the task finished, there should be a mention of 'oifs_43r3_ ... file absent'. Have a look at what was happening immediately before the crash. Here, other software in the ubuntu VM was complaining and subsequently the openIFS task threw a 'file absent' wobbly. The key message from Glenn was to make sure to 'Leave non-GPU tasks in memory while suspended' (tick the box in 'Boinc - computing preferences, Disk and memory'). Since I've done this, the only two task crashes have been due to a power brown-out a couple of days ago.
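
A small sketch of that syslog check, counting 'absent' output files per openIFS task. The message format is taken from BOINC lines seen in syslog on this host; the function name is made up, and other BOINC versions may word the message differently. To run it on a real log, pass in an open file such as /var/log/syslog instead of the sample list.

```python
import re
from collections import Counter

# Matches e.g. "... Output file <name>.zip for task <task> absent"
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def count_absent_files(lines):
    """Return a Counter of absent output files keyed by openIFS task name."""
    counts = Counter()
    for line in lines:
        match = ABSENT.search(line)
        if match and match.group(2).startswith("oifs_43r3_"):
            counts[match.group(2)] += 1
    return counts


TASK = "oifs_43r3_ps_0395_1982050100_123_951_12168039_0"
sample = [
    f"Dec 21 06:02:37 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:37 [climateprediction.net] Computation for task {TASK} finished",
    f"Dec 21 06:02:37 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:37 [climateprediction.net] Output file {TASK}_r1962384054_44.zip for task {TASK} absent",
    f"Dec 21 06:02:37 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:37 [climateprediction.net] Output file {TASK}_r1962384054_45.zip for task {TASK} absent",
]

for task, missing in count_absent_files(sample).items():
    print(task, missing)
```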
30) Message boards : Number crunching : If you have used VirtualBox for BOINC and have had issues, please can you share these? (Message 67221)
Posted 2 Jan 2023 by wateroakley
Post:
We've been running Virtualbox 6.1 with ubuntu 20.04.1 VM for 18 months and Mac Mojave VM for 9 months.

1. i7-3770 WIN10 host with 32GB of physical RAM.
When sizing the VM memory, the Windoze host will need a minimum of 8GB RAM reserved for Windoze to play in.
You'll need at least 24GB of physical RAM, preferably 32GB. In practice, 16GB physical RAM was insufficient.
The on-line tutorials all create a very small VM disc, far too small. If you need to make a much bigger linux VM disc, try GParted. About 40-100GB has been good so far.
The run-time cpdn crash issues were (in part) from Windoze updates unexpectedly rebooting the host. The answer is to pause Windoze updates for as long as you can and manage the WIN10 updates manually.

2. i7-8700 WIN10 host with 40GB of physical RAM
The ubuntu and Mojave VMs were migrated to this host.
VirtualBox 6.1 is happily running the migrated ubuntu 20.04.1 VM with BOINC/cpdn.
Increasing the 40GB vdi disc size in VirtualBox to 100GB created a 990GB VM disc partition. Oops, fat fingers had added an extra zero. There is no way to reduce the VM disc partition size in VirtualBox without creating a new vdi and moving the VM files around.
Set the number of VM cpus to N-1 physical cores. Tick 'Leave non-GPU tasks in memory while suspended'.
Edit: With the VM running 5 cpdn openIFS tasks, Resource Monitor reports 39GB Physical Memory in use and 1.6GB of cache.
The indicated requirements for upcoming openIFS tasks suggest that more physical RAM will likely be needed.
Unfortunately, the new host does not like the Mac Mojave VM. Neither the migrated copy nor a fresh install get beyond starting the shell. I suspect it's unhappy with the EFI boot? Any suggestions on fixing this are welcome.

3. Quad-core Q9650 with 8GB physical RAM.
We got the ubuntu VM and BOINC/cpdn running. However, it ran one cpdn task, very slowly.
QED: don't bother.
31) Message boards : Number crunching : OpenIFS Discussion (Message 67012)
Posted 22 Dec 2022 by wateroakley
Post:
If you could kindly check your /var/log/syslog file for an entry around the time the task finished. There should be mention of 'oifs_43r3_' something being killed. Let me know what there is.

Out of interest, how many tasks did you have running compared to how many cores? I have got an 11th & 3rd gen Intel i7 and the model has never crashed like this for me. The only suggestion I can make is not to put too many tasks on the machine. Random memory issues like this can depend on how busy memory is. I have one less task than I have cores running (note cores not threads) i.e. 3 tasks max for a 4 core machine. So far, touch wood, it's never crashed and I'm nowhere near my total ram. I was going to do a test by letting more tasks run to see what happens once I've done a few successfully. It's quite tough to debug without being able to reproduce.

thx.
Glenn, as requested, I've pasted the syslog file below, from the time that the WU 12168039 task_o (https://www.cpdn.org/result.php?resultid=22252269) appears to behave (at 05:51am with zip 42 upload) to when it failed (immediately after zip 43) and decided that it had finished at 06:02am. Nothing being killed, but other things do start to go wrong around 05:55am.

The i7-8700 host has six physical cores, 12 virtual cores. The VirtualBox ubuntu VM is configured with 6 cpu and running 6 cpdn tasks. Boinc preferences are configured to use 100% of the cpus and 'Suspend' when non-BOINC cpu is above 75%. The VM is only used for boinc/cpdn work. Should I drop the VM back to 5 CPU?

Interestingly, today we had a power brownout at 1pm and the PC unceremoniously crashed. After restarting, all six IFS tasks restarted successfully :)) That's a first - I've rarely had 100% restart success for non-IFS tasks after a crash.

syslog
Dec 21 05:51:27 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:51:27 [climateprediction.net] Started upload of oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_42.zip
Dec 21 05:51:27 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:51:27 [climateprediction.net] Started upload of oifs_43r3_ps_0401_1982050100_123_951_12168045_0_r692153421_41.zip
Dec 21 05:51:47 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:51:47 [climateprediction.net] Finished upload of oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_42.zip
Dec 21 05:51:49 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:51:49 [climateprediction.net] Finished upload of oifs_43r3_ps_0401_1982050100_123_951_12168045_0_r692153421_41.zip
Dec 21 05:52:52 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:52:52 [climateprediction.net] Started upload of oifs_43r3_ps_0425_1982050100_123_951_12168069_0_r777869862_42.zip
Dec 21 05:53:04 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:53:04 [climateprediction.net] Finished upload of oifs_43r3_ps_0425_1982050100_123_951_12168069_0_r777869862_42.zip
Dec 21 05:54:52 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:54:52 [climateprediction.net] Started upload of oifs_43r3_ps_0422_1982050100_123_951_12168066_0_r1003053545_42.zip
Dec 21 05:55:03 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:55:03 [climateprediction.net] Finished upload of oifs_43r3_ps_0422_1982050100_123_951_12168066_0_r1003053545_42.zip
Dec 21 05:55:08 ih2-VirtualBox dbus-daemon[992]: [session uid=1000 pid=992] Activating via systemd: service name='org.freedesktop.Tracker1' unit='tracker-store.service' requested by ':1.3' (uid=1000 pid=970 comm="/usr/libexec/tracker-miner-fs " label="unconfined")
Dec 21 05:55:08 ih2-VirtualBox systemd[917]: Starting Tracker metadata database store and lookup manager...
Dec 21 05:55:08 ih2-VirtualBox dbus-daemon[992]: [session uid=1000 pid=992] Successfully activated service 'org.freedesktop.Tracker1'
Dec 21 05:55:08 ih2-VirtualBox systemd[917]: Started Tracker metadata database store and lookup manager.
Dec 21 05:55:39 ih2-VirtualBox tracker-store[11393]: OK
Dec 21 05:55:39 ih2-VirtualBox systemd[917]: tracker-store.service: Succeeded.
Dec 21 05:55:54 ih2-VirtualBox dbus-daemon[660]: [system] Activating via systemd: service name='net.reactivated.Fprint' unit='fprintd.service' requested by ':1.44' (uid=1000 pid=1289 comm="/usr/bin/gnome-shell " label="unconfined")
Dec 21 05:55:54 ih2-VirtualBox systemd[1]: Starting Fingerprint Authentication Daemon...
Dec 21 05:55:54 ih2-VirtualBox dbus-daemon[660]: [system] Successfully activated service 'net.reactivated.Fprint'
Dec 21 05:55:54 ih2-VirtualBox systemd[1]: Started Fingerprint Authentication Daemon.
Dec 21 05:55:58 ih2-VirtualBox gnome-shell[1289]: cr_parser_new_from_buf: assertion 'a_buf && a_len' failed
Dec 21 05:55:58 ih2-VirtualBox gnome-shell[1289]: cr_declaration_parse_list_from_buf: assertion 'parser' failed
Dec 21 05:55:58 ih2-VirtualBox NetworkManager[663]: <info>  [1671602158.6148] agent-manager: agent[a34c130ad9aefeac,:1.44/org.gnome.Shell.NetworkAgent/1000]: agent registered
Dec 21 05:55:58 ih2-VirtualBox dbus-daemon[992]: [session uid=1000 pid=992] Activating service name='org.freedesktop.FileManager1' requested by ':1.37' (uid=1000 pid=1289 comm="/usr/bin/gnome-shell " label="unconfined")
Dec 21 05:55:58 ih2-VirtualBox gnome-shell[1289]: cr_parser_new_from_buf: assertion 'a_buf && a_len' failed
Dec 21 05:55:58 ih2-VirtualBox gnome-shell[1289]: cr_declaration_parse_list_from_buf: assertion 'parser' failed
Dec 21 05:55:58 ih2-VirtualBox dbus-daemon[992]: [session uid=1000 pid=992] Activating service name='org.gnome.Nautilus' requested by ':1.37' (uid=1000 pid=1289 comm="/usr/bin/gnome-shell " label="unconfined")
Dec 21 05:55:58 ih2-VirtualBox gnome-shell[1289]: cr_parser_new_from_buf: assertion 'a_buf && a_len' failed
Dec 21 05:55:58 ih2-VirtualBox gnome-shell[1289]: cr_declaration_parse_list_from_buf: assertion 'parser' failed
Dec 21 05:55:58 ih2-VirtualBox dbus-daemon[992]: [session uid=1000 pid=992] Successfully activated service 'org.gnome.Nautilus'
Dec 21 05:55:58 ih2-VirtualBox org.gnome.Nautilus[11423]: Failed to register: Unable to acquire bus name 'org.gnome.Nautilus'
Dec 21 05:55:59 ih2-VirtualBox dbus-daemon[992]: [session uid=1000 pid=992] Successfully activated service 'org.freedesktop.FileManager1'
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: cr_parser_new_from_buf: assertion 'a_buf && a_len' failed
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: cr_declaration_parse_list_from_buf: assertion 'parser' failed
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: Window manager warning: Overwriting existing binding of keysym 31 with keysym 31 (keycode a).
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: Window manager warning: Overwriting existing binding of keysym 38 with keysym 38 (keycode 11).
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: Window manager warning: Overwriting existing binding of keysym 39 with keysym 39 (keycode 12).
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: Window manager warning: Overwriting existing binding of keysym 32 with keysym 32 (keycode b).
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: Window manager warning: Overwriting existing binding of keysym 33 with keysym 33 (keycode c).
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: Window manager warning: Overwriting existing binding of keysym 34 with keysym 34 (keycode d).
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: Window manager warning: Overwriting existing binding of keysym 35 with keysym 35 (keycode e).
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: Window manager warning: Overwriting existing binding of keysym 36 with keysym 36 (keycode f).
Dec 21 05:55:59 ih2-VirtualBox gnome-shell[1289]: Window manager warning: Overwriting existing binding of keysym 37 with keysym 37 (keycode 10).
Dec 21 05:56:25 ih2-VirtualBox systemd[1]: fprintd.service: Succeeded.
Dec 21 05:57:38 ih2-VirtualBox systemd[1]: Starting Ubuntu Advantage Timer for running repeated jobs...
Dec 21 05:57:39 ih2-VirtualBox systemd[1]: ua-timer.service: Succeeded.
Dec 21 05:57:39 ih2-VirtualBox systemd[1]: Finished Ubuntu Advantage Timer for running repeated jobs.
Dec 21 05:58:52 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:58:52 [climateprediction.net] Started upload of oifs_43r3_ps_0645_1981050100_123_950_12167289_1_r1207618552_42.zip
Dec 21 05:58:56 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:58:56 [climateprediction.net] Started upload of oifs_43r3_ps_0248_1981050100_123_950_12166892_1_r549240307_42.zip
Dec 21 05:59:04 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:59:04 [climateprediction.net] Finished upload of oifs_43r3_ps_0645_1981050100_123_950_12167289_1_r1207618552_42.zip
Dec 21 05:59:13 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:59:13 [climateprediction.net] Finished upload of oifs_43r3_ps_0248_1981050100_123_950_12166892_1_r549240307_42.zip
Dec 21 06:00:08 ih2-VirtualBox dbus-daemon[992]: [session uid=1000 pid=992] Activating via systemd: service name='org.freedesktop.Tracker1' unit='tracker-store.service' requested by ':1.3' (uid=1000 pid=970 comm="/usr/libexec/tracker-miner-fs " label="unconfined")
Dec 21 06:00:08 ih2-VirtualBox systemd[917]: Starting Tracker metadata database store and lookup manager...
Dec 21 06:00:08 ih2-VirtualBox dbus-daemon[992]: [session uid=1000 pid=992] Successfully activated service 'org.freedesktop.Tracker1'
Dec 21 06:00:08 ih2-VirtualBox systemd[917]: Started Tracker metadata database store and lookup manager.
Dec 21 06:00:39 ih2-VirtualBox tracker-store[11464]: OK
Dec 21 06:00:39 ih2-VirtualBox systemd[917]: tracker-store.service: Succeeded.
Dec 21 06:02:22 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:22 [climateprediction.net] Started upload of oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_43.zip
Dec 21 06:02:31 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:31 [climateprediction.net] Started upload of oifs_43r3_ps_0401_1982050100_123_951_12168045_0_r692153421_42.zip
Dec 21 06:02:34 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:34 [climateprediction.net] Finished upload of oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_43.zip
Dec 21 06:02:37 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:37 [climateprediction.net] Computation for task oifs_43r3_ps_0395_1982050100_123_951_12168039_0 finished
Dec 21 06:02:37 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:37 [climateprediction.net] Output file oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_44.zip for task oifs_43r3_ps_0395_1982050100_123_951_12168039_0 absent
Dec 21 06:02:37 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:37 [climateprediction.net] Output file oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_45.zip for task oifs_43r3_ps_0395_1982050100_123_951_12168039_0 absent
….. File absent ... 46-120
Dec 21 06:02:37 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:37 [climateprediction.net] Output file oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_121.zip for task oifs_43r3_ps_0395_1982050100_123_951_12168039_0 absent
Dec 21 06:02:37 ih2-VirtualBox boinc[834]: 21-Dec-2022 06:02:37 [climateprediction.net] Output file oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_122.zip for task oifs_43r3_ps_0395_1982050100_123_951_12168039_0 absent
32) Message boards : Number crunching : OpenIFS Discussion (Message 66997)
Posted 21 Dec 2022 by wateroakley
Post:
It's possible AMD chips are triggering memory bugs in the code depending on what else happens to be in memory at the same time (hence the seemingly random nature of the fail). Hard to say exactly at the moment but it could also be something system/hardware related specific to Ryzens. I have never seen the model fail like this before on the processors I've worked with in the past (none of which were AMD unfortunately). I am tempted to turn down the optimization and see what happens....

I did a little bit of searching and found 3 tasks that failed with errors you described on intel processors. I think it might be too early to attribute these errors as ryzen specific.
https://www.cpdn.org/result.php?resultid=22245369
Exit status 5 (0x00000005) Unknown error code
double free or corruption (out)

Ah thanks. That's useful, and the first time I've seen an error code 5 (I/O error) but consistent with something file related. Frustrating there is no traceback from the model - which is why I didn't think it was the model in the first place.

I can spend time looking at this but it's the law of diminishing returns. The OpenIFS batch error rate is now 10%, HadSM4's is about 5%. It would be nice to get it lower but I also need to move the higher resolution work on which I've not been able to start yet. This is quite important to attract scientists to the platform.

Hello Glenn,

https://www.cpdn.org/result.php?resultid=22252269

This task crashed earlier today with double free or corruption (out). It’s an IFS task running in VirtualBox, ubuntu 20.04, on an Intel i7-8700, WIN10 host, running ubuntu VM https://www.cpdn.org/show_host_detail.php?hostid=1512045. The VM has 32GB RAM assigned (40GB physical) and about 100GB disc (2TB physical)


The only touches today around 6:00am:
a) In the Win10 host, I updated our daily energy usage in excel and saved the two files.
b) In the ubuntu VM, I looked to see what had happened overnight in the BOINC event log.

No changes to the ubuntu host or BOINC manager. No stops or restarts, no config changes.

The other five IFS tasks have about two hours to go.


Stderr



06:00:54 STEP 1054 H=1054:00 +CPU= 22.385
06:01:16 STEP 1055 H=1055:00 +CPU= 21.665
06:01:57 STEP 1056 H=1056:00 +CPU= 39.203
Moving to projects directory: /var/lib/boinc-client/slots/0/ICMGGhq0f+001056
Moving to projects directory: /var/lib/boinc-client/slots/0/ICMSHhq0f+001056
Moving to projects directory: /var/lib/boinc-client/slots/0/ICMUAhq0f+001056
Adding to the zip: /var/lib/boinc-client/projects/climateprediction.net/oifs_43r3_ps_12168039/ICMGGhq0f+001032
Adding to the zip: /var/lib/boinc-client/projects/climateprediction.net/oifs_43r3_ps_12168039/ICMSHhq0f+001032
Adding to the zip: /var/lib/boinc-client/projects/climateprediction.net/oifs_43r3_ps_12168039/ICMUAhq0f+001032
Adding to the zip: /var/lib/boinc-client/projects/climateprediction.net/oifs_43r3_ps_12168039/ICMGGhq0f+001044
Adding to the zip: /var/lib/boinc-client/projects/climateprediction.net/oifs_43r3_ps_12168039/ICMSHhq0f+001044
Adding to the zip: /var/lib/boinc-client/projects/climateprediction.net/oifs_43r3_ps_12168039/ICMUAhq0f+001044
Zipping up the intermediate file: /var/lib/boinc-client/projects/climateprediction.net/oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_43.zip
Uploading the intermediate file: upload_file_43.zip
06:02:19 STEP 1057 H=1057:00 +CPU= 20.970
double free or corruption (out)


If there is anything particular you would like me to look at and report, please let me know.
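
For anyone wanting to scan a stderr listing like the one above for unusually slow model steps, here is a minimal sketch. The 1.5x-median threshold and the function names are arbitrary choices for illustration, not part of OpenIFS or BOINC. In this excerpt the slow step coincides with the point where the intermediate zip was built, which may or may not have anything to do with the crash.

```python
import re
from statistics import median

# Matches lines such as "06:01:57 STEP 1056 H=1056:00 +CPU= 39.203"
STEP = re.compile(r"^\d{2}:\d{2}:\d{2}\s+STEP\s+(\d+)\s+H=\s*\S+\s+\+CPU=\s*([\d.]+)")


def step_times(lines):
    """Extract (step number, CPU seconds) pairs from stderr lines."""
    found = []
    for line in lines:
        match = STEP.match(line.strip())
        if match:
            found.append((int(match.group(1)), float(match.group(2))))
    return found


def slow_steps(lines, factor=1.5):
    """Steps whose CPU time exceeds factor x the median step time."""
    times = step_times(lines)
    if not times:
        return []
    typical = median(cpu for _, cpu in times)
    return [step for step, cpu in times if cpu > factor * typical]


stderr_sample = [
    "06:00:54 STEP 1054 H=1054:00 +CPU= 22.385",
    "06:01:16 STEP 1055 H=1055:00 +CPU= 21.665",
    "06:01:57 STEP 1056 H=1056:00 +CPU= 39.203",
    "Uploading the intermediate file: upload_file_43.zip",
    "06:02:19 STEP 1057 H=1057:00 +CPU= 20.970",
]
print(slow_steps(stderr_sample))  # [1056]
```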

33) Message boards : Number crunching : OpenIFS Discussion (Message 66755)
Posted 3 Dec 2022 by wateroakley
Post:
There *is* an issue with restarts. The model process itself restarts just fine if the client/machine is shutdown and restarted. However, the controlling wrapper code then appears to be miscalculating where the model is in the forecast and this leads to the 'missing file' problem that's been reported.

So if you can manage to keep the tasks running uninterrupted they *should* work (famous last words), or at least fail less often. I have not tried 'keep non-gpu tasks in memory' option, that might help. And I know I said OpenIFS shouldn't have restart problems, but it's not the fault of the model ;)
Thanks Glenn. After a lot of hit and miss tasks, the last six 'uninterrupted' OpenIFS tasks here have completed!
34) Message boards : Number crunching : OpenIFS Discussion (Message 66704)
Posted 1 Dec 2022 by wateroakley
Post:
ADSL people: knowing your bottleneck is network, are you happy just reducing the no. of tasks running concurrently? What sustainable data-flow rate would you be happy with (give me a number to work with)?
The broadband uplink here is 12Mbps and downlink at 40Mbps. It's pretty consistent at that speed. The event log showed that uploads from six concurrent tasks over the past few days are taking 12-15 seconds each, which is not giving a network headache. A single new task download (3 jf_c... files) is less than two minutes.
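
Upload times like these can be pulled out of the event log rather than read off by eye. A minimal sketch, assuming the 'Started upload of' / 'Finished upload of' wording and the dd-Mon-yyyy timestamp that BOINC writes to syslog on this host; the function name is illustrative, and the sample lines are from a later log on the same VM.

```python
import re
from datetime import datetime

UPLOAD = re.compile(
    r"(\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \[climateprediction\.net\] "
    r"(Started|Finished) upload of (\S+)"
)


def upload_durations(lines):
    """Map each uploaded file name to its upload time in seconds."""
    started, durations = {}, {}
    for line in lines:
        match = UPLOAD.search(line)
        if not match:
            continue
        when = datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S")
        event, name = match.group(2), match.group(3)
        if event == "Started":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


A = "oifs_43r3_ps_0395_1982050100_123_951_12168039_0_r1962384054_42.zip"
B = "oifs_43r3_ps_0425_1982050100_123_951_12168069_0_r777869862_42.zip"
log_sample = [
    f"Dec 21 05:51:27 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:51:27 [climateprediction.net] Started upload of {A}",
    f"Dec 21 05:51:47 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:51:47 [climateprediction.net] Finished upload of {A}",
    f"Dec 21 05:52:52 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:52:52 [climateprediction.net] Started upload of {B}",
    f"Dec 21 05:53:04 ih2-VirtualBox boinc[834]: 21-Dec-2022 05:53:04 [climateprediction.net] Finished upload of {B}",
]
for name, seconds in upload_durations(log_sample).items():
    print(f"{seconds:5.0f}s  {name}")
```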
35) Message boards : Number crunching : OpenIFS Discussion (Message 66700)
Posted 1 Dec 2022 by wateroakley
Post:
To get the openIFS tasks to run on this VirtualBox ubuntu host: https://www.cpdn.org/results.php?hostid=1512045, I increased the ubuntu VM disc partition from 40GB to 100GB (gparted).

After five early openIFS successes, the subsequent tasks have crashed with one error or another. The event log has reported a lot of 'file absent' records, with no obvious local reason that I can see.
This afternoon I've increased the memory allocated to the ubuntu VM from 28GB to 32GB and reduced cpus (tasks running) from six to four.

On a positive note, after the reboot all the suspended tasks started up successfully!
36) Message boards : Number crunching : Hardware for new models. (Message 66333)
Posted 10 Nov 2022 by wateroakley
Post:
Even so, about 100W is nothing. A fridge gives that off. Open the window.
Are you sure? An E-rated American style fridge-freezer is about 350 kWh a year, that's 40W. A B-rated fridge is about 137 kWh a year, or 15W. Our i7-cpu based PCs pull over 100W without the monitors, about 900kWh a year. The waste heat from two PCs is warming our home office today, very slightly.
37) Questions and Answers : Unix/Linux : What does: "negative theta detected" mean? (Message 66276)
Posted 30 Oct 2022 by wateroakley
Post:
My UBUNTU 20.04 Laptop R7-5800H with 16 GB ran a single HadSM4 at N144 unit which errored out after six minutes with the following report. Was this due to missing libraries which I meanwhile installed?
Model crashed: ATM_DYN : NEGATIVE THETA DETECTED.
The work unit will have crashed with NEGATIVE THETA due to an unfortunate choice of initial conditions. The crash is not due to missing libraries.
38) Message boards : Number crunching : Windows-based tasks? (Message 66195)
Posted 16 Oct 2022 by wateroakley
Post:
I took a break from Climateprediction some time ago, as there appeared to be no Windows-based tasks available. Now retried, for a week, again no joy. Should I let it be?
CPDN will run on different flavours of Linux VMs, including WSL and VirtualBox.
This message link and thread: https://www.cpdn.org/forum_thread.php?id=9025&postid=64962#64962 gives a rundown on using Oracle VirtualBox 6 and ubuntu 20.04 LTS:
Make sure you find the correct 32 bit libraries for your Linux version: https://www.cpdn.org/forum_thread.php?id=8916
Please note the importance of pausing and managing M$ Windoze updates, otherwise the updates will routinely crash models. The ubuntu VM has migrated quite happily to a newer PC with Windows 10 and VirtualBox.

I've also got a Mac Mojave VM running and posted details on how to install it here: https://www.cpdn.org/forum_thread.php?id=9124&postid=65231#65231
39) Message boards : Number crunching : Excuse me? Zero Credit? (Message 66170)
Posted 2 Oct 2022 by wateroakley
Post:
It used to be that the credit script was run only once a week early UTC Sunday. But I thought that had changed recently. Respond back to this thread if no credit shows up by Monday and we'll alert Andy to the problem.

Pretty certain it still just runs once a week. I haven't had any work for a while but looking at the steps on the graph from BOINC manager makes it pretty clear that if it has changed it is only in the past week.
With one resent task the credit updated today, so the script appears to be running. With the latest price of 'leccy' and no tasks, I'd set the host to 'sleep' after an hour. Interestingly, the Linux VM and CPDN task has not complained or aborted with 'sleep'. I've now reset the 'sleep' setting.
40) Questions and Answers : Windows : Future CPDN on Windows? (Message 66156)
Posted 30 Sep 2022 by wateroakley
Post:
Is CPDN planned to be able to run on Windows 10 / 11 in the future? If not, could CPDN run in a Linux VirtualBox system?
CPDN will run on different flavours of Linux VMs, including WSL and VirtualBox.
This message link and thread https://www.cpdn.org/forum_thread.php?id=9025&postid=64962#64962 gives a rundown on using Oracle VirtualBox 6 and ubuntu 20.04 LTS:
Make sure you find the correct 32 bit libraries for your Linux version: https://www.cpdn.org/forum_thread.php?id=8916
Please note the importance of pausing and managing M$ Windoze updates, otherwise the updates will routinely crash models. The VM has migrated quite happily to a newer Windows PC.

I've also got a Mac Mojave VM running, although the VM was unhappy after the migration.


