climateprediction.net home page
*** Running 32bit CPDN from 64bit Linux - Discussion ***

Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 66102 - Posted: 13 Sep 2022, 19:34:23 UTC - in response to Message 66099.  

I believe there are at least a few other WSL2 users here too. Maybe it is your hacking. :-)

WSL2 works great for me with CPDN (Ubuntu 20.04), except that after a couple of weeks it just stops and I have to reboot. I think others have noted the same problem.
But I run my machines 24/7. If you reboot anyway, you probably will not notice that problem.

However, you then have the problem that CPDN will not always survive a reboot. You really should pause it before rebooting, and it should work.
ID: 66102
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2167
Credit: 64,468,868
RAC: 3,594
Message 66103 - Posted: 14 Sep 2022, 0:43:40 UTC - in response to Message 66101.  

Hopefully soon there will be no need for this thread.

That would be fantastic! Hopefully your effort will be successful.
ID: 66103
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 11,789,841
RAC: 19,380
Message 66104 - Posted: 14 Sep 2022, 6:42:40 UTC - in response to Message 66102.  
Last modified: 14 Sep 2022, 6:52:29 UTC

Jim1348,
I haven't noticed the problem you describe. I've also run WSL2 24/7; the longest stretch I can remember is about 4 weeks, with no issues. Have you noticed it with different computers? I don't often run it for extended periods, as I do Windows updates and reboot between batches of CPDN tasks and, if there are no CPDN tasks, reboot whenever updates come up. So it's possible I just haven't run into it.
ID: 66104
Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 66105 - Posted: 14 Sep 2022, 8:08:12 UTC - in response to Message 66104.  

Have you noticed it with different computers?
I have only one Win10 machine, but it has been through several updates (also updates on the Linux side), and the problem has persisted for over a year.
https://www.cpdn.org/forum_thread.php?id=9025&postid=63462#63462

The machine is stable otherwise.
ID: 66105
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 11,789,841
RAC: 19,380
Message 66106 - Posted: 14 Sep 2022, 8:41:40 UTC - in response to Message 66101.  

Glenn,
Here's where I got the single OS per model type: https://www.climateprediction.net/getting-started/support/technical-faq/#why_only_available_on_one_operating_system
What you describe fits with that even more: a single OS (Linux) for all models. You're looking to make the VM approach easier to use, like Rosetta and LHC, which is good. Right now we have to set up some kind of virtualization manually (WSL2, Hyper-V, VirtualBox) to run the models that aren't built for our default OSes.

Thank you for clarifying OpenIFS vs. Hadley. I suspected that they're not the same and that Hadley will likely still be around.

I don't think the failure rate needs to be that high, even right now. The project could alert new users at setup with a message linking to the message board post that explains how to obtain the proper libraries. The project could also restrict computers that have failed a small number of tasks in a row to only 1 task a day until they start producing successful results. We have computers around that have failed hundreds and even thousands of tasks. For example: https://www.cpdn.org/results.php?hostid=1517479
Surely there are ways to restrict computers that can't produce successful results.
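
To make the throttling idea concrete, here is a minimal sketch of how such a rule might look. It is not how the CPDN or BOINC server actually works; the class, the streak limit of 3, and the normal quota of 10 are all hypothetical values picked for illustration.

```python
from dataclasses import dataclass

# All names and thresholds here are hypothetical, chosen only for illustration.
FAILURE_STREAK_LIMIT = 3   # "a small number of tasks in a row"
THROTTLED_QUOTA = 1        # tasks per day while a host keeps failing
NORMAL_QUOTA = 10          # assumed default daily quota


@dataclass
class Host:
    host_id: int
    consecutive_failures: int = 0

    def record_result(self, success: bool) -> None:
        """Reset the failure streak on success, extend it on failure."""
        if success:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    def daily_quota(self) -> int:
        """Return 1 task/day once the failure streak reaches the limit."""
        if self.consecutive_failures >= FAILURE_STREAK_LIMIT:
            return THROTTLED_QUOTA
        return NORMAL_QUOTA
```

With this rule, one successful result is enough to lift the restriction again, so a machine whose libraries get fixed recovers on its own.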
ID: 66106
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4341
Credit: 16,497,105
RAC: 6,447
Message 66107 - Posted: 14 Sep 2022, 14:28:31 UTC - in response to Message 66106.  

Surely there are ways to restrict computers that can't produce successful results.
There was a time when Andy would manually set the maximum number of work units per day to -1 (0 being no restriction) on the serial killers that crashed absolutely everything. I don't know if there is an easy way to automate this, but I would like to see it done, particularly for the computers with missing libraries. I suspect this might be a bigger problem now than it used to be because of Science United, where users have virtually no control over which projects their boxes run; unless they also sign up to projects via the web sites, they won't have access to the forums that tell them how to sort things out.
ID: 66107
Dark Angel

Joined: 31 May 18
Posts: 43
Credit: 4,305,745
RAC: 4,146
Message 66148 - Posted: 29 Sep 2022, 1:35:36 UTC

Perhaps if there were a test file users could download that depends on the same libraries as the actual work files, users could check their systems against it to verify everything is installed properly before wasting project time and bombing actual work units.
Personally, I'd like to be able to grab, say, an application file and run ldd on it to verify I got everything installed correctly, rather than waiting until I start seeing errors and ending up aborting hundreds of units until I get it fixed, which is what has happened previously.
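
The ldd check described above could be scripted roughly like this. It is only a sketch: the function names are made up, and it assumes ldd marks unresolved libraries with "=> not found" on their own lines, which is the usual glibc ldd output.

```python
import subprocess


def missing_libraries(ldd_output: str) -> list[str]:
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libstdc++.so.6 => not found"
        name, arrow, target = line.strip().partition(" => ")
        if arrow and target.strip() == "not found":
            missing.append(name)
    return missing


def check_binary(path: str) -> list[str]:
    """Run ldd on a binary and list its unresolved shared libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return missing_libraries(result.stdout)
```

An empty list is not absolute proof that a 32-bit application will run: if the 32-bit loader itself is missing, ldd may not list any libraries at all, so it is worth reading the raw ldd output as well.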
ID: 66148
Dark Angel

Joined: 31 May 18
Posts: 43
Credit: 4,305,745
RAC: 4,146
Message 66149 - Posted: 29 Sep 2022, 2:17:35 UTC

Just scored a work unit! W00T!
At least now I can confirm I have all the required libs installed.
ID: 66149
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 11,789,841
RAC: 19,380
Message 66291 - Posted: 7 Nov 2022, 2:39:24 UTC - in response to Message 66290.  
Last modified: 7 Nov 2022, 2:49:36 UTC

I've never used Arch Linux, but from looking around I believe these are the 2 packages you'll need: lib32-gcc-libs and lib32-glibc. They seem to have all of the shared libraries required to run all of the currently available Linux models. You might only need the first one, but I can't be sure without looking at it or testing. To install them, I believe you'd run the command:
sudo pacman -S lib32-gcc-libs lib32-glibc

Unfortunately there isn't any work available to test it out. The best you can do is install the 2 packages, stay connected, and wait.
ID: 66291
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4341
Credit: 16,497,105
RAC: 6,447
Message 66292 - Posted: 7 Nov 2022, 6:10:40 UTC - in response to Message 66291.  

Might be worth looking at the BOINC message boards. WCG used to have 32bit work that required the libraries. Not sure if it still does. Searching the BOINC boards may throw up which other projects do. I believe there are a couple but can't remember which.
ID: 66292
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1055
Credit: 16,515,145
RAC: 835
Message 66297 - Posted: 8 Nov 2022, 18:27:28 UTC - in response to Message 66296.  

With a bit of luck, my Arch installation might now be CPDN-ready. I'll wait for some work, and see what happens.


You may have quite a wait. My last (but one) work unit was at the end of this July. It completed successfully.
Since then, I have received only one work unit (yesterday), and it failed like this:

<core_client_version>7.20.2</core_client_version>
<![CDATA[
<message>
process exited with code 22 (0x16, -234)</message>
<stderr_txt>

Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy

Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy

Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy

Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy

Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy

Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xnnuj.pipe_dummy
Sorry, too many model crashes! :-(
07:28:41 (795039): called boinc_finish(22)

</stderr_txt>
]]>


I was the last of five users to fail this way.
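
For anyone comparing several failed results like the one above, a small script can pull out the exit code and count the crash messages. This is a sketch only: the function name is invented, and it assumes the crash reason ends at the first full stop, as it does in this log.

```python
import re
from collections import Counter


def summarize_stderr(text: str) -> dict:
    """Extract the exit code and tally 'Model crashed' reasons from stderr."""
    exit_match = re.search(r"process exited with code (\d+)", text)
    # Capture the reason up to the first full stop followed by whitespace.
    reasons = Counter(re.findall(r"Model crashed:\s*(.*?\.)(?=\s|$)", text))
    return {
        "exit_code": int(exit_match.group(1)) if exit_match else None,
        "crash_count": sum(reasons.values()),
        "reasons": dict(reasons),
    }
```

Run on the output above, this should report exit code 22 and six crashes, all with the same NEGATIVE THETA reason. When every user who received the work unit fails the same way, that points at the task itself rather than at missing libraries.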
ID: 66297
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4341
Credit: 16,497,105
RAC: 6,447
Message 66298 - Posted: 8 Nov 2022, 18:39:47 UTC

I tried to get some of the four OpenIFS tasks from the latest testing batch but got a server error (feeder not running). I informed Andy, but the tasks were all gone by the time I got up this morning and saw the message that the problem had been fixed. I am hoping that most researchers will be able to swap to the OpenIFS from the Met Office models as that would get rid of the issue and with the size of recent models, any computer old enough to still be 32bit really isn't up to the job of running recent tasks.
ID: 66298
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1055
Credit: 16,515,145
RAC: 835
Message 66299 - Posted: 8 Nov 2022, 19:51:39 UTC - in response to Message 66298.  

I am hoping that most researchers will be able to swap to the OpenIFS from the Met Office models as that would get rid of the issue and with the size of recent models, any computer old enough to still be 32bit really isn't up to the job of running recent tasks.


My machine is certainly 64-bit. But I also have the necessary 32-bit compatibility libraries to run MetOffice work units.

Computer 1511241

Total credit 	6,152,503
Average credit 	0.83
CPU type 	GenuineIntel
Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz [Family 6 Model 85 Stepping 7]
Number of processors 	16
Operating System 	Linux Red Hat Enterprise Linux
Red Hat Enterprise Linux 8.6 (Ootpa) [4.18.0-372.26.1.el8_6.x86_64|libc 2.28]
BOINC version 	7.20.2
Memory 	62.28 GB
Cache 	16896 KB
Swap space 	15.62 GB
Total disk space 	488.04 GB  [dedicated partition for Boinc]
Free Disk Space 	479.28 GB  [dedicated partition for Boinc]
Measured floating point speed 	6.13 billion ops/sec
Measured integer speed 	26.09 billion ops/sec
Average upload rate 	153.25 KB/sec
Average download rate 	8479.32 KB/sec


Must I do anything, as a regular user (not in the test group), to run these, or will they just appear and start running when they become available?
ID: 66299
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 66300 - Posted: 8 Nov 2022, 20:24:42 UTC - in response to Message 66299.  

When they do show up, it will be "business as usual".
:)
ID: 66300
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1055
Credit: 16,515,145
RAC: 835
Message 66304 - Posted: 8 Nov 2022, 23:36:58 UTC - in response to Message 66300.  

OK, so if my 64 Gig of RAM is big enough, I should be able to run more than one at a time?
ID: 66304
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 66307 - Posted: 9 Nov 2022, 1:30:12 UTC - in response to Message 66304.  

Apparently the server will decide.
ID: 66307
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4341
Credit: 16,497,105
RAC: 6,447
Message 66309 - Posted: 9 Nov 2022, 9:41:48 UTC

OK, so if my 64 Gig of RAM is big enough, I should be able to run more than one at a time?
Should be no problem with that. With one of the testing batches, which were single core and used up to about 5GB max, I was able to run 4 at once on my now dead laptop, which only had 8GB of RAM. However, because it was using swap, even with an SSD, that did slow them down a lot.
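
As a rough back-of-envelope check, the number of tasks that fit in RAM without touching swap is the memory divided by the per-task peak, capped by the core count. The sketch below uses the roughly 5GB peak from the testing batch mentioned above, which may not hold for production batches, and it ignores memory used by the OS and other programs; the function name is made up.

```python
def max_concurrent_tasks(ram_gb: float, per_task_gb: float, cores: int) -> int:
    """Number of single-core tasks that fit in RAM without swapping."""
    if per_task_gb <= 0:
        raise ValueError("per_task_gb must be positive")
    return min(cores, int(ram_gb // per_task_gb))


# Figures from this thread: 62.28 GB of memory, 16 processors, ~5 GB per task.
print(max_concurrent_tasks(62.28, 5, 16))
```

For the 62.28 GB, 16-processor machine listed earlier this gives 12 tasks, while an 8GB laptop gets 1, which matches the experience that running 4 at once there only worked by leaning on swap.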
ID: 66309
KAMasud

Joined: 6 Oct 06
Posts: 204
Credit: 7,608,986
RAC: 0
Message 66312 - Posted: 9 Nov 2022, 10:54:11 UTC

There is something wrong with the page at BOINC on installing BOINC on Linux. They have changed the contents of the page recently: the command "apt" has been changed to "aptitude". Linux Mint recognises this command but cannot find the packages boinc-manager and boinc-client. Zorin, however, does not recognise the command at all. So I tried the old command "apt-get", which is still recognised, but it is still unable to locate the packages boinc-manager and boinc-client.
Both Mint and Zorin are the latest. Some help is required.
ID: 66312
Alan K

Send message
Joined: 22 Feb 06
Posts: 484
Credit: 29,590,874
RAC: 1,482
Message 66326 - Posted: 9 Nov 2022, 23:23:23 UTC - in response to Message 66312.  

Try running "apt update" before the command to install BOINC.
ID: 66326
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 11,789,841
RAC: 19,380
Message 66331 - Posted: 10 Nov 2022, 8:10:26 UTC - in response to Message 66312.  

Unless I'm missing something, it doesn't look like BOINC is available from a repository for Mint or Zorin. That'd explain why those packages aren't found. You might have to try another approach like building it from source.
ID: 66331

©2024 climateprediction.net