climateprediction.net (CPDN) home page
Posts by Dark Angel

1) Message boards : News : Statement on the recent disruption to the climateprediction.net project
Message 71508
Posted 19 Sep 2024 by Dark Angel
@rebranding@ means two things
- Someone allowed a marketing drone out of their enclosure and even worse actually listened to them
- The organisation considers its brand a failure.
Neither is positive.
2) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71214
Posted 6 Aug 2024 by Dark Angel
Hang on a moment - this thread is getting very muddled. Where did we get the idea that benchmarks hadn't run from? And whose account, which host, were we talking about at the time?

It seems to have started with AndreyOR's message 71185 - but that wasn't replying to a specific prior post, and the reference to 'you' is ambiguous.

I'm assuming that the reference was to Dark Angel, who is showing two computers on his account:

Linux host 1534740
Windows host 1548438 - further assumed to be running in a VM on the Linux machine, per message 71172

As I type now, both hosts are showing a normal 'measured speed' - Linux 3.14GHz, Windows 5.73GHz (*)

Where exactly did the 'no benchmark' idea come from?

* the difference in speed between the two instances is itself interesting, but for another thread.

No need to assume; I have said plainly that my Windows 10 host is running in a VM. It has 16GB of RAM allocated and 4 CPU cores.
The host machine is my Linux box.
Further, I have manually elevated the CPDN tasks in the Windows VM to run at "high" priority. The VM itself is running with a nice of 0 while native Linux tasks will get a nice of 19 the same as any other Boinc task.

I don't know where that 5.73GHz figure came from. The Windows machine reports that it is running at 3.7GHz, the base clock of the CPU, and while I have adaptive clocking enabled I do not overclock my hardware. It just wears things out too fast for my liking.
3) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71204
Posted 5 Aug 2024 by Dark Angel
Yes, the installations are independent but they're still using the same CPU resources so you (I) have to make sure to not allow both Boinc installs to overlap in what they're using. I've restricted my host Boinc from using the cores allocated to the VM plus a couple of spares as a buffer.
I have my host Boinc installed on its own drive and my VMs reside on a different physical drive. Both are separate from the host OS drive. This is very deliberate, both to prevent I/O bottlenecks and to prevent my Boinc install from wearing out my nvme host drive with excessive writes (a problem when running LHC CMS work in particular. I also run a network proxy to mitigate the amount of data LHC downloads for each work unit when I'm running that project).
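A rough sketch of the core budgeting described above, in Python. The VM's 4 cores and the "couple of spares" come from this thread; the 16-thread total is purely a hypothetical figure for illustration, as is the function name. The result is the sort of value one might enter in the host client's "Use at most N% of the CPUs" computing preference.

```python
def host_cpu_percent(total_threads, vm_threads, spare_threads):
    """Percentage of CPU threads left for the host client after
    reserving threads for the VM and a safety buffer."""
    usable = total_threads - vm_threads - spare_threads
    if usable < 0:
        raise ValueError("reserved threads exceed the total available")
    return 100.0 * usable / total_threads

# Hypothetical 16-thread CPU; 4 threads for the VM and 2 spares as in the post.
percent = host_cpu_percent(total_threads=16, vm_threads=4, spare_threads=2)
print(f"Host client may use at most {percent:.1f}% of the CPUs")
```

This only caps how many threads the host client uses; it does not pin work to particular cores.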
4) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71202
Posted 5 Aug 2024 by Dark Angel
I suspect it might also matter what other projects are running on the CPU at the time. Currently I'm working through a cache of WCG and Einstein (on GPU) work on the host with work limits set in app_config to keep threads free for the VM. WCG is less demanding on the system than, say, LHC work, especially for RAM, disk, and network I/O. Milkyway used to be very hard on the FPU due to its need for double precision. Asteroids isn't memory intensive but it uses the latest CPU extensions.
5) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71189
Posted 5 Aug 2024 by Dark Angel
In case anyone is wondering, I've been careful to not overcommit the machine and keep a couple of cores free no matter what I'm crunching.
6) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71188
Posted 5 Aug 2024 by Dark Angel
Just to cover my bases I suspended everything on my host machine and ran the benchmarks on the VM.
It's done now but still not grabbing any more work, which makes sense given the expected completion for the tasks I have is still 38 days away. That estimate is coming down several times faster than the elapsed time is going up, so I suspect it will just take a while to level itself out.
7) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71182
Posted 4 Aug 2024 by Dark Angel
Finally got some units.
The last thing I did was to set the work cache to 10 and 10

Don't know if that was the winning combination or if someone did something at the server end.
8) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71181
Posted 4 Aug 2024 by Dark Angel
I'm avoiding hitting the update button precisely because of the backoff issue resetting every time.
Haven't seen any specific messages about storage space, and I got no work overnight.
This morning I wiped the client, rebooted, deleted all residual files, and installed fresh.

I enabled work_fetch_debug

4/08/2024 20:41:27 | | [work_fetch] ------- start work fetch state -------
4/08/2024 20:41:27 | | [work_fetch] target work buffer: 432000.00 + 432000.00 sec
4/08/2024 20:41:27 | | [work_fetch] --- project states ---
4/08/2024 20:41:27 | climateprediction.net | [work_fetch] REC 0.000 prio 0.000 can't request work: scheduler RPC backoff (2729.13 sec)
4/08/2024 20:41:27 | | [work_fetch] --- state for CPU ---
4/08/2024 20:41:27 | | [work_fetch] shortfall 3456000.00 nidle 4.00 saturated 0.00 busy 0.00
4/08/2024 20:41:27 | climateprediction.net | [work_fetch] share 0.000
4/08/2024 20:41:27 | | [work_fetch] ------- end work fetch state -------

I have the project set to 1000 priority. Previously it was set to 100. I don't recall ever setting this project to zero.
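As a sanity check on the numbers in that log, the shortfall figure is consistent with the buffer settings: 432000 seconds is 5 days, and four idle CPU instances times the combined buffer gives 3456000 seconds. The Python sketch below just reproduces that arithmetic. It assumes the shortfall for an empty queue is simply idle instances multiplied by the total buffer, which is a reading of the log and not a statement about how the client computes it internally; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 86400

def empty_queue_shortfall(min_buffer_sec, extra_buffer_sec, idle_instances):
    """Seconds of work missing when every instance is idle and nothing is queued
    (assumed model: total buffer multiplied by the number of idle instances)."""
    return (min_buffer_sec + extra_buffer_sec) * idle_instances

# Values taken from the work_fetch log above.
shortfall = empty_queue_shortfall(432000.0, 432000.0, 4)
buffer_days = 432000.0 / SECONDS_PER_DAY

print(f"each buffer value = {buffer_days} days, shortfall = {shortfall} sec")
```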
9) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71174
Posted 4 Aug 2024 by Dark Angel
I've expanded the virtual hdd and added another 60GB to the filesystem so I'll see how that goes.
10) Message boards : Number crunching : Tasks available, but I am not getting them.
Message 71172
Posted 4 Aug 2024 by Dark Angel
I only run Windows in a VM but have been able to get work in the past. Currently the server simply refuses to send me any work for this instance. I've tried several times, left it overnight to sort itself out, reset the project, and rebooted any number of times, and it still simply will not give me any work on the Windows instance. I managed to get a few Linux units on the host machine (OpenIFS work), and those that have completed have been successful, but the Windows machine just sits idle.
Has something changed in the requirements? Have I not allocated enough RAM (4GB per core, four cores)? I get nothing from the logs in any useful time frame thanks to the hour backoff from the server every time.
11) Message boards : Number crunching : New Work Announcements 2024
Message 70699
Posted 3 Apr 2024 by Dark Angel
It's great there's new work, but it'd be even better if it would actually let machines requesting it HAVE some.
12) Message boards : Number crunching : New Work Announcements 2024
Message 70511
Posted 22 Feb 2024 by Dark Angel
I think I know what it is. I come back to this project intermittently and this time I did so with a VM that hadn't been on it before. Of course it put me on a limited number of work units until I return some successfully. I aborted the batch 1005 unit I had and the other three are still running. I'll just have to wait until I return at least one successful unit.
13) Message boards : Number crunching : New Work Announcements 2024
Message 70510
Posted 22 Feb 2024 by Dark Angel
Log from the latest work fetch request (I let BOINC do it on its own; I didn't click update, so it would wait out the full time-out)

22/02/2024 01:54:14 | climateprediction.net | [css] running wah2_eas25_a33x_200512_24_1007_012268885_0 ( )
22/02/2024 01:54:14 | | [cpu_sched_debug] enforce_run_list: end
22/02/2024 01:54:26 | | choose_project(): 1708566866.014561
22/02/2024 01:54:26 | | [work_fetch] ------- start work fetch state -------
22/02/2024 01:54:26 | | [work_fetch] target work buffer: 259200.00 + 259200.00 sec
22/02/2024 01:54:26 | | [work_fetch] --- project states ---
22/02/2024 01:54:26 | climateprediction.net | [work_fetch] REC 721.330 prio -0.699 can't request work: scheduler RPC backoff (3570.09 sec)
22/02/2024 01:54:26 | | [work_fetch] --- state for CPU ---
22/02/2024 01:54:26 | | [work_fetch] shortfall 1031812.16 nidle 0.00 saturated 2431.98 busy 0.00
22/02/2024 01:54:26 | climateprediction.net | [work_fetch] share 0.000 project is backed off (resource backoff: 5007.51, inc 4800.00)
22/02/2024 01:54:26 | | [work_fetch] ------- end work fetch state -------
22/02/2024 01:54:26 | climateprediction.net | choose_project: scanning
22/02/2024 01:54:26 | climateprediction.net | skip: scheduler RPC backoff
22/02/2024 01:54:26 | | [work_fetch] No project chosen for work fetch
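For anyone wanting to pull the backoff values out of a long event log instead of reading it by eye, here is a small Python sketch. The regular expression is written against the line format shown in the log above and is an assumption about that format, not something taken from BOINC documentation; the sample text is two lines copied from this post.

```python
import re

# Matches e.g. "scheduler RPC backoff (3570.09 sec)" as seen in the log above.
BACKOFF_RE = re.compile(r"scheduler RPC backoff \((\d+(?:\.\d+)?) sec\)")

def backoff_seconds(log_text):
    """Return every scheduler RPC backoff value (in seconds) found in the text."""
    return [float(m.group(1)) for m in BACKOFF_RE.finditer(log_text)]

SAMPLE = (
    "22/02/2024 01:54:26 | climateprediction.net | [work_fetch] REC 721.330 "
    "prio -0.699 can't request work: scheduler RPC backoff (3570.09 sec)\n"
    "22/02/2024 01:54:26 | climateprediction.net | skip: scheduler RPC backoff\n"
)

values = backoff_seconds(SAMPLE)
minutes = [round(v / 60, 1) for v in values]
print(values, minutes)
```

The second sample line mentions the backoff without a number, so only the first line contributes a value.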
14) Message boards : Number crunching : New Work Announcements 2024
Message 70509
Posted 22 Feb 2024 by Dark Angel
For some reason it's not letting me have any.
I upped the number of CPU cores and RAM in my VM last night to do more, extended my work cache settings, and freed up disk space, but it's still not giving me any more than the three I currently have.


What's your client log say about the reason it's not requesting new work? There's usually some obvious-ish reason listed.


22/02/2024 00:52:38 | climateprediction.net | Sending scheduler request: To fetch work.
22/02/2024 00:52:38 | climateprediction.net | Requesting new tasks for CPU
22/02/2024 00:52:41 | climateprediction.net | Scheduler request completed: got 0 new tasks
22/02/2024 00:52:41 | climateprediction.net | No tasks sent
22/02/2024 00:52:41 | climateprediction.net | Project requested delay of 3636 seconds

That's all I'm getting for now, I'll enable a few more logging options and see if anything new comes up at the next update.
15) Message boards : Number crunching : New Work Announcements 2024
Message 70506
Posted 21 Feb 2024 by Dark Angel
So is there any word on when further new work will drop?
Server status currently showing 704 tasks ready to send, though doubtless that has dropped a bit since the last server update. I am guessing it may not be till next week that we get another of the misconfigured batches sent out. The person who normally sends batches out is away and I don't know how much time Glenn has free to do this. If he doesn't have time it will have to wait till the person who normally does it is back.

Edit: 704 was from the newest batch. There were also a few retreads from 1001.


For some reason it's not letting me have any.
I upped the number of CPU cores and RAM in my VM last night to do more, extended my work cache settings, and freed up disk space, but it's still not giving me any more than the three I currently have.
16) Message boards : Number crunching : New Work Announcements 2024
Message 70501
Posted 21 Feb 2024 by Dark Angel
So is there any word on when further new work will drop?
17) Message boards : Number crunching : Uploads not working
Message 70488
Posted 20 Feb 2024 by Dark Angel
It's a simple and reversible change if you want to try it.

Network and Internet > Change Adaptor Settings > (select adaptor, I double clicked) > Properties > Configure > Advanced Tab > Large Send Offload (IPv4) > set to disabled > ok and close back out.

There's probably a quicker way to that menu but that got me there.
18) Message boards : Number crunching : Uploads not working
Message 70485
Posted 20 Feb 2024 by Dark Angel
Which network adapter are you using with VirtualBox out of interest? eno1 or wlp6s0? I checked my Win10 VM and I have wlp6s0 enabled (aka wireless), which strictly speaking I shouldn't have for bridging but I was lazy and selected the first option. Curious to know if the problem only affects the ethernet adapter, eno1.

I was also reading this article about problems with the default VirtualBox ethernet network adapter being the root cause:
https://petri.com/how-to-improve-network-performance-in-windows-virtualbox-guests/
It suggests installing the virtio-net adapter type for better performance. I was trying to check the driver was not junk when I saw your fix. I might still try this though.

Ok, I did a thing. It appears to have worked. In a sense I was right about not having this problem if they were Linux tasks, but only because if they were I wouldn't be using a VM to run them. The problem is a bug in the VirtualBox Bridged Network adaptor.
The change I made wasn't NAT, it was to the Windows network device driver settings.
Specifically I disabled “Large Send Offload (IPv4)” in the adaptor settings.
BOOM! Next upload file runs at 2000KBps instead of 3.2KBps

I found the solution here in the last post: https://forums.virtualbox.org/viewtopic.php?t=110486


I have different adaptors to you, I think it's host specific. This is the selection I have:
(can't attach a picture so I'll have to type it out)
PCnet-PCI II(Am79C970A)
PCnet-FAST III (Am79C973)
Intel PRO/1000 MT Desktop (82540EM)
Intel PRO/1000 T Server (82543GC)
Intel PRO/1000 MT Server (82545EM)
Paravirtualised Network (virtio-net)

I found my VM refused to connect to the network with either of the PCnet adaptors or the Paravirtualised adaptor and had the same issue with all three Intel adaptors.
The adaptor name in my case is enp7s0 (Ethernet) with the options of wlp6s0 (WiFi) or ham0 (I run a VPN to my home server for my mobile devices)

I saw one upload go at over 3000KBps just now. That's rather a spectacular improvement over 3.2KBps for what amounts to a very simple change.
19) Message boards : Number crunching : Uploads not working
Message 70475
Posted 20 Feb 2024 by Dark Angel
Ok, I did a thing. It appears to have worked. In a sense I was right about not having this problem if they were Linux tasks, but only because if they were I wouldn't be using a VM to run them. The problem is a bug in the VirtualBox Bridged Network adaptor.
The change I made wasn't NAT, it was to the Windows network device driver settings.
Specifically I disabled “Large Send Offload (IPv4)” in the adaptor settings.
BOOM! Next upload file runs at 2000KBps instead of 3.2KBps

I found the solution here in the last post: https://forums.virtualbox.org/viewtopic.php?t=110486
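To put the two rates in perspective, a quick Python calculation. The 2000 and 3.2 figures are the ones quoted above, taken here to mean kilobytes per second; the 100 MB file size is a hypothetical round number, not the size of an actual CPDN upload file.

```python
def upload_seconds(size_kb, rate_kb_per_sec):
    """Time in seconds to upload size_kb kilobytes at a steady rate."""
    return size_kb / rate_kb_per_sec

SLOW_RATE = 3.2     # KB/s before disabling Large Send Offload
FAST_RATE = 2000.0  # KB/s after the change

file_kb = 100 * 1024  # hypothetical 100 MB file

slow = upload_seconds(file_kb, SLOW_RATE)
fast = upload_seconds(file_kb, FAST_RATE)
speedup = FAST_RATE / SLOW_RATE

print(f"before: {slow / 3600:.1f} h, after: {fast:.1f} s, speedup: {speedup:.0f}x")
```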
20) Message boards : Number crunching : Uploads not working
Message 70473
Posted 20 Feb 2024 by Dark Angel
I changed my VM networking from Bridged to NAT to see if it makes any difference but I'll have to wait until the server unlocks the files again.
I doubt that'll make any difference as that just affects how the VM appears on the local network. The outside world only sees your router IP address, not the local IPs.


I figured that, but thought it was worth a try. It just made my proxy logs harder to read.

©2025 cpdn.org