climateprediction.net home page
If you have used VirtualBox for BOINC and have had issues, please can you share these?

ncoded.com
Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67174 - Posted: 31 Dec 2022, 16:45:38 UTC
Last modified: 31 Dec 2022, 17:31:06 UTC

If you have had issues with VirtualBox (on ANY BOINC Project), please can you share your experience and the problems you faced?

Please NOTE that this post is not for solving issues; it's about documenting the issues that people have experienced.
ID: 67174

ncoded.com
Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67178 - Posted: 31 Dec 2022, 17:22:20 UTC - in response to Message 67174.  

This is our experience of VirtualBox:

OS: Windows 10

1) When we installed VB around 5 years ago on dual Xeons, the server startup time went from around 5 mins to over 30 mins. When we removed VB it went back to normal. When we reinstalled VB it went back to over 30 mins.

2) When we tried to use VirtualBox for LHC around 3 years ago it just constantly crashed. Eventually, after days of trying to work out the issue, we were advised that the VirtualBox version that BOINC installs with was not the version that LHC required. However, when we finally got it "working" it just made our workstation constantly sluggish.

3) Other times we have used VirtualBox (sorry, I can't remember which project it was), more than a few times when we monitored our server we found that it had frozen completely, and it then took a hard reboot to bring it back. We never had this issue before, or after, using VirtualBox on this server.
ID: 67178

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4346
Credit: 16,535,294
RAC: 5,887
Message 67179 - Posted: 31 Dec 2022, 17:49:39 UTC

I eventually got LHC et al. to work with VB. My issue was that tasks would seem to start and then crash. Eventually I saw a hint from Glenn that let me change a config file that gets messed up. That was on Linux. I have no experience with VB on other platforms.
ID: 67179

ncoded.com
Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67180 - Posted: 31 Dec 2022, 18:02:21 UTC - in response to Message 67179.  
Last modified: 31 Dec 2022, 18:07:17 UTC

Thank you Dave.

Hopefully people will realise that the feedback given here on VirtualBox is potentially very useful for CPDN, rather than being some randomly asked question.
ID: 67180

wujj123456
Joined: 14 Sep 08
Posts: 87
Credit: 32,981,759
RAC: 14,695
Message 67181 - Posted: 31 Dec 2022, 18:23:35 UTC
Last modified: 31 Dec 2022, 18:26:45 UTC

I haven't had major issues once it's set up. I use VB on Windows 10 (now 11) for LHC. The only notable thing is hardware virtualization support. If I upgrade my BIOS or reset CMOS for whatever reason, my motherboard (X470) defaults to disabling hardware virtualization support. VB, even though installed, is then no longer detected by BOINC and any tasks relying on VB fail immediately. This shouldn't be a problem for a server board or any board released in the past two to three years. AFAIK, hardware virtualization support is now generally enabled by default.

Edit: It might be worth calling out that my experience with VB is relatively recent. My Windows install uses VB 6, and VB 7 had no problems on Linux either before I moved to native apps.
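
As a rough illustration of the hardware virtualization check described above, here is a minimal Python sketch for Linux hosts. It is an assumption-laden aside, not something from the post (which concerns Windows): it only looks for the `vmx` (Intel VT-x) or `svm` (AMD-V) CPU flags in `/proc/cpuinfo`-style text. The function name is hypothetical, and on some kernels a flag may still be listed when the firmware has disabled the feature, so treat a positive result as necessary rather than sufficient.

```python
def has_hw_virtualization(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line lists vmx (Intel VT-x) or svm (AMD-V)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The flags line is a whitespace-separated list of CPU feature names.
            if set(value.split()) & {"vmx", "svm"}:
                return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as fh:
            print("virtualization flag present:", has_hw_virtualization(fh.read()))
    except OSError:
        print("no readable /proc/cpuinfo here; this check is Linux-only")
```

If the flag is missing after a BIOS update or CMOS reset, the firmware setting is the first thing to re-check.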
ID: 67181

AndreyOR
Joined: 12 Apr 21
Posts: 247
Credit: 11,999,430
RAC: 23,693
Message 67182 - Posted: 1 Jan 2023, 2:36:47 UTC

My only significant experience with VBox is getting a macOS Mojave VM set up to run those Mac-only Hadley models. I could not get it to work on a Ryzen 5900X but did get it working on an older Intel i7-4790; both PCs run Windows 10. It ran stably for months crunching those models, and I devoted a lot of resources to that VM. Windows and VBox upgrades didn't break it either. One problem I did have, which crashed the models, was a time mismatch that BOINC was detecting even though the clocks on the VM and Windows 10 matched. I tried a few things but couldn't figure out the problem. The only thing that took care of the issue was disabling time sync in macOS in the VM, so I just keep it off.

When it comes to Linux virtualization on Windows PCs, I'm a big fan of WSL2, which is part of Windows. It uses far fewer resources than VBox, and you can just close PowerShell and it keeps running in the background; you don't even see it, unlike VBox. I have used it extensively for BOINC and while it occasionally has its quirks, it runs very well. The quirks don't show up on CPDN, only on projects that are complicated to set up, like LHC ATLAS and Theory. Also, unlike regular Linux installations, WSL2 ones come with all of the 32-bit libraries needed to run Hadley models by default.
ID: 67182

Glenn Carver
Joined: 29 Oct 17
Posts: 806
Credit: 13,593,584
RAC: 7,495
Message 67188 - Posted: 1 Jan 2023, 16:25:13 UTC
Last modified: 1 Jan 2023, 16:26:26 UTC

@ncoded - have you tried VBox more recently than 3-5 years ago? That's quite a long time in software development terms. It might be worth another try.

Make sure you install the VBox guest additions as well; they're needed for shared folder access.

The BOINC bug that Dave refers to is in the systemd service file provided by recent BOINC installs (at time of writing). I've posted about this before. The default service file for BOINC has an issue which stops VBox from accessing the /tmp filesystem that it needs. See: https://github.com/BOINC/boinc/issues/3355.

In the BOINC client override file or the service file itself, you must have:

[Service]
ProtectSystem=full

Hopefully this will be fixed in later releases.
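
To check which value is actually in effect, a small Python sketch like the following can read unit-file text and report the `ProtectSystem=` setting. It is only an illustration: the function name is hypothetical, and it assumes you pass in the merged unit text (for example the output of `systemctl cat` for the BOINC client service, whose name varies by distribution). When the key appears more than once, the last occurrence wins, which is the behaviour assumed here for drop-in overrides.

```python
import configparser


def protect_system_value(unit_text: str):
    """Return the last ProtectSystem= value in the [Service] section, or None."""
    # strict=False tolerates repeated sections and keys; later values replace earlier ones.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep the case-sensitive key names used in unit files
    parser.read_string(unit_text)
    return parser.get("Service", "ProtectSystem", fallback=None)


if __name__ == "__main__":
    sample = "[Service]\nProtectSystem=strict\n\n[Service]\nProtectSystem=full\n"
    print(protect_system_value(sample))
```

A result other than `full` would suggest the override described above has not been applied.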

@AndreyOR : the timing issue is a new one on me, never seen that before. Had network problems before on a very old system which came down to the driver.

I'm also a big fan of WSL2, though of course this doesn't help with Vbox applications :)
ID: 67188

gemini8
Joined: 4 Dec 15
Posts: 52
Credit: 2,182,959
RAC: 836
Message 67190 - Posted: 1 Jan 2023, 18:19:41 UTC

I'm running vbox just fine.
My machines have several VMs in which I do Boinc work, and additionally I can run vbox tasks on the main system.
I'm running vboxes on Linux, MacOS and Windows.
Inside the vboxes I usually use Debian.
- - - - - - - - - -
Greetings, Jens
ID: 67190

ncoded.com
Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67199 - Posted: 2 Jan 2023, 7:28:21 UTC
Last modified: 2 Jan 2023, 7:28:44 UTC

Yes it has been around 3 years since we last ran VirtualBox.

If I get time I will try to set up a workstation to test out VirtualBox on Windows. If that works 100% without any issues then I'll try installing and running on Linux.

Are you going to use the same version of VirtualBox as other BOINC projects use? It's my understanding that you cannot run different versions of VirtualBox on the same computer.
ID: 67199

wujj123456
Joined: 14 Sep 08
Posts: 87
Credit: 32,981,759
RAC: 14,695
Message 67201 - Posted: 2 Jan 2023, 8:29:45 UTC - in response to Message 67199.  

Looks like I completely misunderstood the question. I thought it was about BOINC apps using VirtualBox, but instead you are planning to install Linux inside VB to run BOINC. Then you aren't really limited to VB, since any virtualization solution would do, including Hyper-V, VMware Player or whatever else works well for Windows. WSL2 is also a choice.

Then I have my own curious question for others who have done this. Since the guest system would not have access to the host state by definition, all of those user-interaction-oriented features simply won't work, right? Like detecting idling, exclusive applications, etc.?
ID: 67201

ncoded.com
Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67202 - Posted: 2 Jan 2023, 8:32:37 UTC - in response to Message 67201.  
Last modified: 2 Jan 2023, 8:48:06 UTC

I have no plans to run any OS inside of VirtualBox.

The only reason it would be installed is for the BOINC Project(s) that require it.
ID: 67202

wateroakley
Joined: 6 Aug 04
Posts: 185
Credit: 27,123,458
RAC: 3,218
Message 67221 - Posted: 2 Jan 2023, 15:44:19 UTC
Last modified: 2 Jan 2023, 16:04:23 UTC

We've been running VirtualBox 6.1 with an Ubuntu 20.04.1 VM for 18 months and a Mac Mojave VM for 9 months.

1. i7-3770 WIN10 host with 32GB of physical RAM.
When sizing the VM memory, the Windoze host will need a minimum of 8GB RAM reserved for Windoze to play in.
You'll need at least 24GB of physical RAM, preferably 32GB. In practice, 16GB physical RAM was insufficient.
The on-line tutorials all create a very small VM disc, far too small. If you need to make a much bigger linux VM disc, try GParted. About 40-100GB has been good so far.
The run-time cpdn crash issues were (in part) from Windoze updates unexpectedly rebooting the host. The answer is to pause Windoze updates for as long as you can and manage the WIN10 updates manually.

2. i7-8700 WIN10 host with 40GB of physical RAM
The Ubuntu and Mojave VMs were migrated to this host.
VirtualBox 6.1 is happily running the migrated ubuntu 20.04.1 VM with BOINC/cpdn.
Increasing the 40GB vdi disc size in VirtualBox to 100GB created a 990GB VM disc partition. Oops, fat fingers had added an extra zero. There is no way to reduce the VM disc partition size in VirtualBox without creating a new vdi and moving the VM files around.
Set the number of VM cpus to N-1 physical cores. Tick 'Leave non-GPU tasks in memory while suspended'.
Edit: With the VM running 5 cpdn openIFS tasks, Resource Monitor reports 39GB Physical Memory in use and 1.6GB of cache.
The indicated requirements for upcoming OpenIFS tasks suggest that more physical RAM will likely be needed.
Unfortunately, the new host does not like the Mac Mojave VM. Neither the migrated copy nor a fresh install get beyond starting the shell. I suspect it's unhappy with the EFI boot? Any suggestions on fixing this are welcome.

3. Quad-core Q9650 with 8GB physical RAM.
We got the ubuntu VM and BOINC/cpdn running. However, it ran one cpdn task, very slowly.
QED: don't bother.
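
The rules of thumb in this post (reserve about 8GB for the Windows host, give the VM N-1 of the physical cores, and double-check the disc size before resizing) can be sketched in Python as below. This is an illustrative aside under stated assumptions: the function names and the `.vdi` path are hypothetical, the 8GB reserve is the figure quoted above rather than a measured requirement, and the command is only built and printed, not run. `VBoxManage modifymedium ... --resize` takes a size in megabytes, which is where an extra zero is easy to miss.

```python
def vm_sizing(physical_ram_gb: int, physical_cores: int, host_reserve_gb: int = 8):
    """Return (vm_ram_gb, vm_cpus) using the rules of thumb from the post above."""
    vm_ram_gb = physical_ram_gb - host_reserve_gb
    if vm_ram_gb <= 0:
        raise ValueError("not enough physical RAM to reserve memory for the host")
    vm_cpus = max(1, physical_cores - 1)  # leave one physical core for the host
    return vm_ram_gb, vm_cpus


def resize_command(vdi_path: str, new_size_gb: int) -> list:
    """Build (but do not run) a VBoxManage resize command; --resize is in MB."""
    return ["VBoxManage", "modifymedium", "disk", vdi_path,
            "--resize", str(new_size_gb * 1024)]


if __name__ == "__main__":
    print(vm_sizing(physical_ram_gb=40, physical_cores=6))
    # Print the command so the size can be checked before running it by hand.
    print(" ".join(resize_command("ubuntu.vdi", 100)))
```

For the 6-core, 40GB host in item 2, this gives a 32GB VM with 5 CPUs; printing the resize command first makes a 100GB versus 1000GB slip easy to spot before the disc is changed.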
ID: 67221

Harri Liljeroos
Joined: 9 Dec 05
Posts: 111
Credit: 12,038,780
RAC: 1,393
Message 67222 - Posted: 2 Jan 2023, 16:12:02 UTC
Last modified: 2 Jan 2023, 16:12:27 UTC

I think that we need to remember that there are two ways to run a VirtualBox environment with BOINC.
1. Create your own virtual computer and install Linux or some other OS on it. Then install BOINC and attach to CPDN and/or some other project.
2. Use a project-made virtual machine file that is run automatically by BOINC in the host OS (Windows, Linux etc.). The project defines what kind of CPU the virtual environment emulates and what OS to use, and allocates the necessary resources for it (CPU cores, memory, disk size etc.). This is how LHC is using VirtualBox for Atlas, CMS and Theory tasks in Windows.

I think that the number 2 is the way CPDN is thinking of doing it?
ID: 67222

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4346
Credit: 16,535,294
RAC: 5,887
Message 67223 - Posted: 2 Jan 2023, 17:08:36 UTC
Last modified: 2 Jan 2023, 18:17:59 UTC

I think that the number 2 is the way CPDN is thinking of doing it?
Correct. Apart from enabling testing work for Windows under WINE at the same time as running Linux tasks in the host OS (and vice versa), my main use of VB has been to play with settings I don't want to risk in the host. Running the current batches in the VM, I am getting about a 5% performance hit compared to running them natively. That is with Ubuntu 20.10 as both host and guest.
ID: 67223

Harri Liljeroos
Joined: 9 Dec 05
Posts: 111
Credit: 12,038,780
RAC: 1,393
Message 67229 - Posted: 2 Jan 2023, 22:29:29 UTC

I have a few observations from the LHC about running VB tasks:
1. BOINC (or BOINC Manager) running in the host OS does not know about the memory usage of a VB task; it relies only on the rsc_memory_bound value that is set up for a task. So actual memory consumption monitoring needs to be done inside the VB. Windows Task Manager doesn't know the actual memory consumption of the virtual machine either; it just sees that memory is used but does not know by which process (or at least won't show it to you).
2. The task's actual progress is also not reported back to BOINC Manager. BOINC Manager shows the user a simulated progress, but it is wildly inaccurate at the best of times and downright bogus sometimes. If the user wants to see the actual progress, that has to be built into the VB application, as LHC's Atlas tasks have.
ID: 67229

Glenn Carver
Joined: 29 Oct 17
Posts: 806
Credit: 13,593,584
RAC: 7,495
Message 67235 - Posted: 3 Jan 2023, 10:00:24 UTC - in response to Message 67201.  

Looks like I completely misunderstood the question. I thought it was about BOINC apps using VirtualBox, but instead you are planning to install Linux inside VB to run BOINC. Then you aren't really limited to VB, since any virtualization solution would do, including Hyper-V, VMware Player or whatever else works well for Windows. WSL2 is also a choice.
Yes, it is about BOINC apps using VirtualBox, because CPDN is hoping to use it to create a version of OpenIFS that will run on Windows using vbox.

The app running inside the virtual machine has access to a shared folder with the host for transferring files. As I understand it from what I've read, other functions like suspend etc work as normal.
ID: 67235

Glenn Carver
Joined: 29 Oct 17
Posts: 806
Credit: 13,593,584
RAC: 7,495
Message 67236 - Posted: 3 Jan 2023, 10:07:45 UTC - in response to Message 67229.  

I have a few observations from the LHC about running VB tasks:
1. BOINC (or BOINC Manager) running in the host OS does not know about the memory usage of a VB task; it relies only on the rsc_memory_bound value that is set up for a task. So actual memory consumption monitoring needs to be done inside the VB. Windows Task Manager doesn't know the actual memory consumption of the virtual machine either; it just sees that memory is used but does not know by which process (or at least won't show it to you).
The virtual machine will be created with a memory partition big enough for the task's rsc_memory_bound plus an overhead for the minimal OS running in the virtual machine. I would have thought boincmgr would see the memory used by the virtual machine itself, which will be very close to the actual task usage. I've not looked at this in detail yet, though there is obviously a memory overhead to using a VM. Exactly how much I don't know, but OpenIFS is quite a high-memory app anyway, so the relatively small overhead probably won't matter.

2. The task's actual progress is also not reported back to BOINC Manager. BOINC Manager shows the user a simulated progress, but it is wildly inaccurate at the best of times and downright bogus sometimes. If the user wants to see the actual progress, that has to be built into the VB application, as LHC's Atlas tasks have.
Ok, good point. I guess the VM has to open up a port to the host in order to communicate with boincmgr about progress. I'll look into this but if LHC have done it, I can ask them. Thx.
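
The sizing rule mentioned above (the task's rsc_memory_bound plus an overhead for the guest OS) amounts to simple arithmetic, sketched here in Python. The 512MB default overhead and the function name are purely hypothetical placeholders, since the post says the real overhead is not yet known; the only assumption taken from BOINC itself is that rsc_memory_bound is expressed in bytes.

```python
import math


def vm_memory_mb(rsc_memory_bound_bytes: int, guest_overhead_mb: int = 512) -> int:
    """Return a VM memory size in MB: the task bound rounded up, plus guest overhead."""
    task_mb = math.ceil(rsc_memory_bound_bytes / (1024 * 1024))
    return task_mb + guest_overhead_mb


if __name__ == "__main__":
    # Example: a task bound of 6 GiB with the placeholder overhead.
    print(vm_memory_mb(6 * 1024**3))
```

Because the VM holds this allocation for the whole run, the figure is what matters when working out how many such tasks fit on a host.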
ID: 67236

wujj123456
Joined: 14 Sep 08
Posts: 87
Credit: 32,981,759
RAC: 14,695
Message 67262 - Posted: 3 Jan 2023, 22:00:53 UTC - in response to Message 67235.  

Yes, it is about BOINC apps using VirtualBox, because CPDN is hoping to use it to create a version of OpenIFS that will run on Windows using vbox.

The app running inside the virtual machine has access to a shared folder with the host for transferring files. As I understand it from what I've read, other functions like suspend etc work as normal.

Thanks. Sorry I was missing this context. Then my first comment is more relevant and LHC has been working well for me with VB for at least a year running some WUs every day.
ID: 67262

xii5ku
Joined: 27 Mar 21
Posts: 79
Credit: 78,302,757
RAC: 1,077
Message 67269 - Posted: 3 Jan 2023, 23:32:30 UTC
Last modified: 3 Jan 2023, 23:35:07 UTC

Pitfalls with vboxwrapper based applications which come to my mind:

– Already mentioned by others, virtualization support must be enabled in the BIOS.

– AFAIK you can have only one hypervisor (VirtualBox, KVM, Hyper-V…) active at a time.

– Linux: Does the boinc user need to be made member of the vboxusers group? I think so, but may be wrong.

– Cosmology@home's camb_boinc2docker and Rosetta@home's rosetta python projects tasks quite frequently end up in the "Postponed: VM job unmanageable, restarting later" state. (I don't know if LHC@home's non-native applications are similarly affected; it's too long since I ran them.) IME, the only practical way to deal with such tasks is to abort them (and lose all the CPU time which was already spent on such a task before it stalled). I have read in project forums that the reason for these stalls may be that these projects supply outdated, buggy vboxwrapper versions, and that a more up-to-date vboxwrapper might fix it. I haven't verified this claim.

– Native tasks are by default started at the lowest possible scheduling priority (IOW, at the highest "nice" value). That's apparently not the case with the vbox VMs: when I last ran such a project, the VMs ran at "normal" priority.

– Due to the previous issue, probably combined with other reasons, a computer running such tasks can easily become sluggish, IOW inconvenient or even outright hard to interact with.

– RAM footprint: Last time I checked, the VM has got a large amount of RAM allocated for the whole duration of the task (sum of peak RAM usage of the application, RAM usage of the guest OS, plus a safety margin). That's very wasteful, even more so on high core count computers which could run many of such tasks in parallel. With native applications, all of the extra RAM beyond the application's average RAM consumption would, most of the time, be usable by the operating system for its filesystem cache, et cetera.

– Disk footprint: The VM images take a decent amount of space.

– Network transfers: The VM images need to be downloaded.

– Network transfers control taken away from the boinc client: All vboxwrapper based applications which I have encountered so far perform network transfers from within the VM, completely outside of the control of the boinc client.

– Kernel driver required: VirtualBox hooks deeply into the kernel. Many users won't care, but I for example don't allow so-called out-of-tree drivers on my main computer; I accept such software only on those of my computers which have reduced reliability requirements, notably computers which are dedicated to nothing but Distributed Computing.
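
On the vboxusers question in the list above, the membership itself is easy to inspect, even if whether it is required remains open. The Python sketch below parses `/etc/group`-style text and reports whether a user is listed in a group. It is an illustration only: the function names are hypothetical, the account name `boinc` is an assumption (it varies by distribution), and a user whose primary group is vboxusers would not appear in the member list.

```python
def group_members(group_file_text: str, group_name: str):
    """Return the listed members of group_name from /etc/group-style text, or None."""
    for line in group_file_text.splitlines():
        fields = line.strip().split(":")
        # Each entry has four fields: name, password, GID, comma-separated members.
        if len(fields) == 4 and fields[0] == group_name:
            return [member for member in fields[3].split(",") if member]
    return None


def is_member(group_file_text: str, user: str, group_name: str = "vboxusers") -> bool:
    """Return True if user is listed as a supplementary member of group_name."""
    members = group_members(group_file_text, group_name)
    return members is not None and user in members


if __name__ == "__main__":
    try:
        with open("/etc/group", encoding="utf-8") as fh:
            print("boinc in vboxusers:", is_member(fh.read(), "boinc"))
    except OSError:
        print("no readable /etc/group on this system")
```

This answers only whether the account is in the group, not whether vboxwrapper tasks fail without it.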
ID: 67269

ncoded.com
Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67270 - Posted: 3 Jan 2023, 23:56:55 UTC
Last modified: 4 Jan 2023, 0:39:13 UTC

<removed>
ID: 67270