fedora 30 64 bit

nairb
Joined: 3 Sep 04
Posts: 58
Credit: 1,503,572
RAC: 6,871
Message 62063 - Posted: 29 Jan 2020, 18:13:36 UTC

Has anybody tried using Fedora 30 Workstation for climate w/u? It's running on an Intel i3 processor. Does the download of new w/u automatically select 64-bit models, or will I need to get some of those 32-bit libs?
That's assuming there are some 64-bit Linux models to be had.

ta
Nairb
ID: 62063
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 1942
Credit: 41,861,826
RAC: 19,854
Message 62065 - Posted: 29 Jan 2020, 19:03:21 UTC - in response to Message 62063.  

There aren't any 64bit models at this time. While one of the applications is planned to be 64bit, no timetable for release of that model has been announced or hinted at. So, to obtain work, the proper 32bit libraries need to be installed, and on Fedora 30, I'm not sure what they are.
ID: 62065
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7262
Credit: 23,235,750
RAC: 5,211
Message 62066 - Posted: 29 Jan 2020, 20:23:47 UTC - in response to Message 62063.  

There's a post at the top of this Linux section about what's needed for other distros, which may give you some clues.
And when you get it working, please post back about what's needed, and how to get it/them. Then we can add that to the list.

Also, that computer could do with more memory. These models are big, and the 64 bit ones mentioned in the previous post needed a bit over 5 Gigs per model, when we ran some tests early last year.
ID: 62066
nairb
Joined: 3 Sep 04
Posts: 58
Credit: 1,503,572
RAC: 6,871
Message 62071 - Posted: 3 Feb 2020, 0:06:03 UTC

An update on using Fedora Linux.
I finally grabbed some w/u's, so I suspended them before they could start and did the following to try and find out which libs the climate app uses.

Here is the linux that I am using. Fedora 30 workstation, 64 bit
uname -mr
5.4.12-100.fc30.x86_64 x86_64

The following is the output from ldd
ldd hadam4_8.52_i686-pc-linux-gnu  
       linux-gate.so.1 (0xf7efa000)
       libpthread.so.0 => /lib/libpthread.so.0 (0xf7ea7000)
       libdl.so.2 => /lib/libdl.so.2 (0xf7ea1000)
       libstdc++.so.6 => not found
       libm.so.6 => /lib/libm.so.6 (0xf7dcf000)
       libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xf7db1000)
       libc.so.6 => /lib/libc.so.6 (0xf7c0a000)
       /lib/ld-linux.so.2 (0xf7efb000)

So it looked like libstdc++.so.6 was missing, and I installed it.
Another way of finding (some of?) the libs is readelf.
Here is part of the output of readelf -d hadam4_8.52_i686-pc-linux-gnu:

0x00000001 (NEEDED) Shared library: [libpthread.so.0]
0x00000001 (NEEDED) Shared library: [libdl.so.2]
0x00000001 (NEEDED) Shared library: [libstdc++.so.6]
0x00000001 (NEEDED) Shared library: [libm.so.6]
0x00000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x00000001 (NEEDED) Shared library: [libc.so.6]

It looked like I had all the libs installed (hopefully).
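
For anyone repeating this check, the "not found" lines can be picked out of the ldd output automatically. This is only a sketch: missing_libs is a made-up helper name, not part of BOINC or the CPDN apps, and it assumes ldd prints missing libraries in the "name => not found" form shown above.

```shell
# Print the name of every shared library that ldd reports as "not found".
# Reads ldd output on standard input, so the binary itself is not needed here.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

It would be used as: ldd hadam4_8.52_i686-pc-linux-gnu | missing_libs. Fed the ldd output above, it prints libstdc++.so.6 and nothing else.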

I un-suspended one of the tasks and... so far it's done almost 1% without falling over.
It's a simple and cheap upgrade to an i5 processor, or even one of the i7 ones, and a bit more RAM. If this test with Fedora works then a better desktop is the way forward.
ID: 62071
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 1942
Credit: 41,861,826
RAC: 19,854
Message 62072 - Posted: 3 Feb 2020, 4:15:54 UTC - in response to Message 62071.  

With your i3 where there are two physical cores (4 logical cores with hyperthreading) and only 3 MB total of L3 cache, I would definitely not run 3 at a time, and 2 at a time will be very slow. Perhaps running only one at a time would make the most sense. These N216 tasks are big cache hogs.
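
To see how many physical cores a machine has, as opposed to logical ones, the lscpu output can be counted. A rough sketch, assuming lscpu from util-linux is installed; physical_cores is a hypothetical helper name.

```shell
# Count distinct (core, socket) pairs in "lscpu -p=CORE,SOCKET" output.
# Lines starting with "#" are comments; hyperthread siblings repeat the same pair.
physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}
```

It would be used as: lscpu -p=CORE,SOCKET | physical_cores. On a two-core i3 with hyperthreading the four logical CPUs should collapse to 2.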
ID: 62072
nairb
Joined: 3 Sep 04
Posts: 58
Credit: 1,503,572
RAC: 6,871
Message 62080 - Posted: 5 Feb 2020, 23:25:16 UTC - in response to Message 62072.  

So far it is on course to do a w/u in 16 days. I have let 2 models run at once, and some SETI w/u also. It does slow down with all 4 logical cores running. The m/b will take an i5 CPU with 4 cores, so I will upgrade to one of these, and in the meantime just run the 2 climate models.
At 20% there are still no trickle-ups.
But this test was to see if Fedora 30 would run the 32-bit apps. This is the last version of Fedora that will support 32 bits, but it should be good for a few years.
Fedora 30 does run SETI, Einstein & Rosetta fine. So far it's doing well on climate also.
ID: 62080
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 1942
Credit: 41,861,826
RAC: 19,854
Message 62081 - Posted: 6 Feb 2020, 0:09:04 UTC - in response to Message 62080.  

Great! Glad it is running fine. There are 4 months in these N216 models and a trickle up after each month, so at about 25% you should see the first one.
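
The arithmetic is simple enough to sketch. Assuming each model month takes about the same time and one trickle goes up per month, the expected progress points are as follows; trickle_points is just an illustrative name.

```shell
# Print the progress percentage at which each monthly trickle is expected,
# for a model that runs the given number of months.
trickle_points() {
    months=$1
    i=1
    while [ "$i" -le "$months" ]; do
        echo $((100 * i / months))
        i=$((i + 1))
    done
}
```

For the 4-month N216 models, trickle_points 4 prints 25, 50, 75 and 100, one per line.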
ID: 62081
nairb
Joined: 3 Sep 04
Posts: 58
Credit: 1,503,572
RAC: 6,871
Message 62085 - Posted: 7 Feb 2020, 0:55:07 UTC

Yup, just after 25% the first model produced a trickle-up. It will be interesting to see how much faster a 4-core i5 CPU will be. It still seems fine for memory; very little swap space is used.
Now it's fingers crossed that the model does not crash.
ID: 62085
nairb
Joined: 3 Sep 04
Posts: 58
Credit: 1,503,572
RAC: 6,871
Message 62142 - Posted: 19 Feb 2020, 22:56:48 UTC

So at 99.81% complete I was getting ready for a celebration... I wandered off for a couple of hours, and when I returned the blasted w/u had errored. It looks like one of the libs was missing.
The error was that libnsl.so.1 was missing. I wonder how many more are missing as well.

A quick yum install found and installed the lib. It turns out it has no longer been included by default in Fedora since release 28. Shame I had to wait 16-odd days to find out.

So anybody using Fedora 30 needs to install this lib (libnsl.so.1) as well. I'm not sure if I should look for a 32-bit version too.

In a couple of days the second w/u should finish... let's hope it finishes as it should.
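
On the 32-bit question: the app name ends in i686-pc-linux-gnu, so it is a 32-bit build, and a 32-bit program can only load 32-bit libraries. One way to check any file is to read the class byte in its ELF header. A sketch only; elf_bits is a made-up helper name.

```shell
# Print 32 or 64 depending on the ELF class of the file given as $1.
elf_bits() {
    # A real ELF file starts with the four bytes 7f 45 4c 46 ("\177ELF").
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo unknown; return 1; }
    # The fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case $class in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown; return 1 ;;
    esac
}
```

Running elf_bits on hadam4_8.52_i686-pc-linux-gnu should print 32, and running it on the installed copy of libnsl.so.1 shows which flavour was installed. On Fedora the 32-bit packages normally carry an .i686 suffix; the exact package name is best confirmed with dnf provides rather than taken from this note.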
ID: 62142
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 1942
Credit: 41,861,826
RAC: 19,854
Message 62143 - Posted: 20 Feb 2020, 4:05:45 UTC - in response to Message 62142.  
Last modified: 20 Feb 2020, 4:16:00 UTC

I probably should have suggested this before since we haven't had much in the way of fedora people posting on the message board for a long time. You can do an ldd (or sudo ldd) on the executables in the .../projects/climateprediction.net directory to make sure the requirements are satisfied. For hadam4h models, those executables would be

hadam4_8.52_i686-pc-linux-gnu
hadam4_se_8.52_i686-pc-linux-gnu.so
hadam4_um_8.52_i686-pc-linux-gnu
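
The check can be run over all of those files in one go. A minimal sketch, assuming the executables all start with hadam4 as in the list above; check_project_libs is a hypothetical helper, and the directory argument is whatever your own .../projects/climateprediction.net path is.

```shell
# Run ldd on every regular file matching a pattern in a directory and
# report each "not found" library together with the file that needs it.
# The LDD variable can name a replacement command; it defaults to ldd.
check_project_libs() {
    dir=$1
    pattern=${2:-hadam4*}
    for f in "$dir"/$pattern; do
        [ -f "$f" ] || continue
        "${LDD:-ldd}" "$f" 2>/dev/null |
            awk -v file="${f##*/}" '/=> not found/ { print file ": " $1 }'
    done
}
```

No output means every library was resolved. Checking the .so as well as the main program matters, because a library needed only by the .so does not show up when ldd is run on the main executable alone.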

Edit...looking at stderr on your failed task page, it looks like that error was thrown 4 times, once for each monthly upload. So, even if all the lib requirements are now satisfied, for your other task nearing completion, the first three months of upload files will be missing. It will likely upload the 4th month, but give an overall task failure because of the other missing monthly upload files. Sorry, that sucks.
ID: 62143
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2792
Credit: 3,659,580
RAC: 11,331
Message 62144 - Posted: 20 Feb 2020, 8:05:29 UTC

I probably should have suggested this before since we haven't had much in the way of fedora people posting on the message board for a long time. You can do an ldd (or sudo ldd) on the executables in the .../projects/climateprediction.net directory to make sure the requirements are satisfied. For hadam4h models, those executables would be


Will add this to the post with instructions for the different distros. As for doing a sudo ldd on the executables, I have taken to doing this whenever I do a fresh install of my distro on the basis that I don't trust the blighters to not miss out something I want/need.
ID: 62144
nairb
Joined: 3 Sep 04
Posts: 58
Credit: 1,503,572
RAC: 6,871
Message 62148 - Posted: 21 Feb 2020, 4:33:13 UTC

Thanks for the info. I was a bit lazy in not doing an ldd on all the executables. It turned out that hadam4_se_8.52_i686-pc-linux-gnu.so was the one missing the lib libnsl.so.1.
I did notice that there was no data being uploaded at the trickle-up times, but since the model did not crash I thought all was working OK.
I aborted the other model at 84%. There was little point in letting it finish and fail.
I have started the 3rd model and will watch at 25% to see if there is data produced to be uploaded.
I suppose I could have followed the instructions on using Ubuntu, but I have used Fedora from the beginning... starting with Red Hat 5.2 and on to Fedora 4... I still have machines with those o/s on. Fedora seems to be OK with several of the other BOINC projects.

Fingers crossed for the next model.............
ID: 62148
nairb
Joined: 3 Sep 04
Posts: 58
Credit: 1,503,572
RAC: 6,871
Message 62152 - Posted: 24 Feb 2020, 21:35:42 UTC

One model has produced a 137 MB upload file for the first trickle... Better than last time. Dare I hope?
ID: 62152
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 1942
Credit: 41,861,826
RAC: 19,854
Message 62153 - Posted: 24 Feb 2020, 22:41:39 UTC - in response to Message 62152.  

Yep. That sounds right.
ID: 62153
nairb
Joined: 3 Sep 04
Posts: 58
Credit: 1,503,572
RAC: 6,871
Message 62176 - Posted: 2 Mar 2020, 0:59:09 UTC

Oh dear... I needed to restart the desktop. All 3 models resumed, then one failed with a computing error. It had only been running for 3 days. But I checked how successful the remaining models had been on other computers. Gulp... not one had been successful.

I needed to restart the desktop machine after a software update which included changes/updates to the fc30 kernel. Maybe this is not a wise thing to do when a model has started. I doubt this is a Fedora 30 problem though.
ID: 62176
Jean-David Beyer
Joined: 5 Aug 04
Posts: 382
Credit: 3,690,501
RAC: 0
Message 62177 - Posted: 2 Mar 2020, 2:10:13 UTC - in response to Message 62176.  

This model did not like to get interrupted:

UK Met Office HadAM4 at N144 resolution v8.08 i686-pc-linux-gnu

But the v8.09 fixed that problem. I have had no trouble with the N216 models.

I run RHEL6.10.
ID: 62177
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 1942
Credit: 41,861,826
RAC: 19,854
Message 62178 - Posted: 2 Mar 2020, 3:05:22 UTC - in response to Message 62176.  

Sometimes when suspending and exiting BOINC, or just exiting BOINC, a file will be left in the slots/x directory or directories, where x is a number, for the models being run. The filename will have a "finished" string as part of the name. It's not supposed to be there, and if it is there when the model starts back up, the model will self-abort. It's a bug in the BOINC code, and it doesn't just affect CPDN. Over the years, the error message in stderr has changed somewhat, but the problem still exists. I've taken to checking the slots directories after I exit BOINC to make sure that file does not exist in those directories. It's kind of a pain, but I've lost several models over the last several years to this bug.
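
That check can be scripted. A sketch under a few assumptions: the BOINC data directory has slots/<number> subdirectories as described above, the leftover file has "finish" somewhere in its name (the glob is deliberately a little broader than "finished"), and find supports -mindepth and -maxdepth as GNU find does. stale_finish_files is a hypothetical helper name. It only lists files and deletes nothing.

```shell
# List files with "finish" in their name directly inside each slots/<n>
# directory under the BOINC data directory given as $1.
# Meant to be run after BOINC has exited, before it is started again.
stale_finish_files() {
    boinc_dir=$1
    find "$boinc_dir"/slots -mindepth 2 -maxdepth 2 -type f -name '*finish*' 2>/dev/null
}
```

Anything it prints is a candidate for removal before BOINC is restarted; an empty result means the slots are clean.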
ID: 62178
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2792
Credit: 3,659,580
RAC: 11,331
Message 62179 - Posted: 2 Mar 2020, 7:07:45 UTC - in response to Message 62176.  

I will start checking in the slots directory, George. Somehow I missed that one, though I'm pretty sure you must have posted about it before.

In the past, my impression has been that models are more likely to crash after a kernel update, and I try to wait till there are no tasks running before doing one. For me, though, the hadam4 tasks seem less prone to crashing after restarts than some of the older task types, so I don't have enough data for them. With regard to the others all having failed on previous computers, tasks here don't go out to more than one host unless they have crashed on the earlier ones. _0 at the end of the task name indicates you are the first person to get that task, _1 the second, etc.
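
The suffix rule can be written down as a small sketch. task_attempt is an illustrative name, and the task names used with it below are invented for the example, not real CPDN task names.

```shell
# Given a task name ending in _N, print which recipient of the task you are
# (_0 is the first, _1 the second, and so on). Fails if there is no numeric suffix.
task_attempt() {
    n=${1##*_}
    case $n in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo $((n + 1))
}
```

So task_attempt example_task_0 prints 1, and a name ending in _2 prints 3, meaning two earlier hosts have already had that task.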
ID: 62179
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 1942
Credit: 41,861,826
RAC: 19,854
Message 62180 - Posted: 2 Mar 2020, 20:22:42 UTC - in response to Message 62179.  
Last modified: 2 Mar 2020, 20:32:24 UTC

Here are 8 such failures, for different tasks, from one of my Linux PCs. I've probably saved over a dozen across a number of PCs over the last year by deleting the problematic files. Of course I've forgotten to do it a couple of times, and those were some of the failures. Most of the time, cleanly shutting down BOINC does not leave the problematic "finished" file in the slots directories. But once in a while it does, and it might be in multiple slots directories with that shutdown, and thus cause multiple failures if not cleaned up. I was running an older version of BOINC, and the line in stderr for the failure is "finish file present too long". The error with a much more recent version of BOINC in nairb's crash was "Process still present 5 min after writing finish file; aborting".

https://www.cpdn.org/result.php?resultid=21871647
https://www.cpdn.org/result.php?resultid=21872302
https://www.cpdn.org/result.php?resultid=21782968
https://www.cpdn.org/result.php?resultid=21782986
https://www.cpdn.org/result.php?resultid=21744149
https://www.cpdn.org/result.php?resultid=21744184
https://www.cpdn.org/result.php?resultid=21744785
https://www.cpdn.org/result.php?resultid=21743766
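
To check whether a failed task died of this particular bug, the stderr text from its result page can be searched for the two messages quoted above. A sketch only; is_finish_file_failure is a made-up name, and other BOINC versions may word the message differently.

```shell
# Succeed if the text on standard input contains either of the two
# finish-file error messages quoted in this thread.
is_finish_file_failure() {
    grep -Eq 'finish file present too long|Process still present 5 min after writing finish file'
}
```

With the stderr text saved to a file of your choosing, feed that file to is_finish_file_failure on standard input and test the exit status.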
ID: 62180
Alan K
Joined: 22 Feb 06
Posts: 347
Credit: 16,816,660
RAC: 1,894
Message 62181 - Posted: 2 Mar 2020, 23:07:48 UTC - in response to Message 62178.  

This might explain why the 3 N216 models I was running crashed after an Ubuntu upgrade a few days ago. Very frustrating, as they had got to 95%!
ID: 62181

©2020 climateprediction.net