climateprediction.net home page
Weird Problem on one Host

old_user3434
Joined: 30 Aug 04
Posts: 77
Credit: 1,785,934
RAC: 0
Message 35835 - Posted: 6 Jan 2009, 0:31:09 UTC
Last modified: 6 Jan 2009, 1:29:44 UTC

I recently restarted CPDN on all my Systems, but on one System I absolutely can't get CPDN to run (?!)

Link to Host
Failed Tasks

From what I can read in stderr.out and from what I saw while watching the Client start up, it looks like BOINC (Linux 5.10.45 x86_64) isn't able to physically launch the Client (??)
The same occurred when no other Project was allowed to run and CPDN was launching a single Model.

The only difference between all my other Linux hosts (almost identical Hardware setup) is that this specific Host is using the new Fedora 10 x86_64, while all others still run Fedora 8 or 9.

I've even gone so far as to give the Projects on that Host full Read/Write permissions as a test (chmod 777 on the entire BOINC directory), but to no effect.
Resetting the Project and detaching from/reattaching to the Project did not change the situation either.

Any ideas? Other than looking at a potential hardware defect, I'm out of ideas here.
For now, I've stopped running CPDN on that Host, as it only trashes model after model until it reaches its daily quota :(
Scientific Network : 44800 MHz - 77824 MB - 1970 GB
3rkko

Joined: 12 Feb 08
Posts: 66
Credit: 4,877,652
RAC: 0
Message 35838 - Posted: 6 Jan 2009, 1:41:51 UTC - in response to Message 35835.  

Do you have 32bit libraries installed?
http://boinc.ssl.berkeley.edu/wiki/Installing_on_Linux
old_user3434
Joined: 30 Aug 04
Posts: 77
Credit: 1,785,934
RAC: 0
Message 35841 - Posted: 6 Jan 2009, 5:39:20 UTC - in response to Message 35838.  
Last modified: 6 Jan 2009, 5:41:08 UTC

Do you have 32bit libraries installed?
http://boinc.ssl.berkeley.edu/wiki/Installing_on_Linux


No, as the CPU, OS and BOINC are all native x86_64...

Weird that identical configurations work perfectly using older releases of Fedora Linux x86_64.
All of these are root Systems operating in a green zone, so file access restrictions are not a factor.
Scientific Network : 44800 MHz - 77824 MB - 1970 GB
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 35842 - Posted: 6 Jan 2009, 5:54:41 UTC

No, as the CPU, OS and BOINC are all native x86_64...

The science apps here, though, are all 32-bit. For one lot (the slab models) there is some server code that sends these 32-bit apps to 64-bit computers that ask for 64-bit apps.
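
For anyone who wants to confirm which kind of binary the project delivered, here is a minimal sketch in Python that reads the ELF header of a file and reports whether it is a 32-bit or 64-bit executable. It relies only on the ELF layout (magic bytes `\x7fELF`, followed by the class byte: 1 for 32-bit, 2 for 64-bit). The function names are made up for this illustration, and the path to the science app inside your BOINC projects directory is something you would have to supply yourself.

```python
import sys

ELF_MAGIC = b"\x7fELF"
# Byte 4 of an ELF file (EI_CLASS) encodes the word size of the binary.
EI_CLASS_NAMES = {1: "32-bit", 2: "64-bit"}


def elf_class(header: bytes) -> str:
    """Classify a file from its first five bytes (hypothetical helper)."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "not ELF"
    return EI_CLASS_NAMES.get(header[4], "unknown")


def elf_class_of_file(path: str) -> str:
    """Read just enough of the file to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))


if __name__ == "__main__":
    # Usage: python elfclass.py /path/to/science_app [more files...]
    for path in sys.argv[1:]:
        print(f"{path}: {elf_class_of_file(path)}")
```

If the science app reports as 32-bit on an x86_64 host, it will need the 32-bit compatibility libraries to start.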

old_user3434
Joined: 30 Aug 04
Posts: 77
Credit: 1,785,934
RAC: 0
Message 35844 - Posted: 6 Jan 2009, 12:37:53 UTC - in response to Message 35842.  

No, as the CPU, OS and BOINC are all native x86_64...

The science apps here, though, are all 32 bits, with some server code for one lot, (slab models), to send these to 64 bit computers that ask for 64 bit apps.



Hm, so far none of my Systems have had trouble accepting 32bit Binaries, and 22 other 64bit systems are happily crunching away. Everything is working as expected with the exception of that one System.

Could you possibly scan through the Host list once and see if there are other Fedora 10 x86_64 Hosts operating normally?
(That would be indication enough for me that this failure is a Host-related problem here on my side.)
Scientific Network : 44800 MHz - 77824 MB - 1970 GB
geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2170
Credit: 64,557,883
RAC: 217
Message 35845 - Posted: 6 Jan 2009, 14:34:47 UTC

I have two Fedora 10 hosts attached to the Beta site, and I had forgotten to install the 32-bit compatibility libraries on one of them, and the first group of downloads (quad core) errored out with the same messages you got. Installed the libraries, and all was good.

Earlier versions of Fedora, at least with a workstation install, automatically installed the compatibility libraries.
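
A quick way to see whether a 64-bit host can start 32-bit programs at all is to look for the 32-bit dynamic loader. The sketch below is in Python and assumes the conventional x86 Linux loader path `/lib/ld-linux.so.2`; the function name is made up for this illustration, and a host could in principle keep the loader elsewhere, so treat a negative result as a hint rather than proof.

```python
import os

# Conventional location of the 32-bit (i386) dynamic loader on x86 Linux.
# Assumption: the distribution follows this convention.
DEFAULT_LOADER_PATHS = ("/lib/ld-linux.so.2",)


def has_32bit_loader(paths=DEFAULT_LOADER_PATHS, exists=os.path.exists) -> bool:
    """Return True if any candidate 32-bit loader path is present.

    `exists` is injectable so the check can be exercised without touching
    the real filesystem.
    """
    return any(exists(p) for p in paths)


if __name__ == "__main__":
    if has_32bit_loader():
        print("32-bit loader found; 32-bit apps should be able to start.")
    else:
        print("No 32-bit loader found; install the 32-bit compatibility libraries.")
```

When the loader is missing, a dynamically linked 32-bit binary typically fails to start even though the file itself exists and is executable, which would be consistent with changing permissions making no difference.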
old_user170894
Joined: 3 Mar 06
Posts: 96
Credit: 353,185
RAC: 0
Message 35849 - Posted: 6 Jan 2009, 17:20:35 UTC
Last modified: 6 Jan 2009, 17:23:24 UTC

I don't know about Fedora 10, but the Fedora 5 through 9 (64-bit) installers ask if you want the ability to run 32-bit apps. If you answer YES, then it will install the 32-bit compatibility libs. Unfortunately it seems like a lot of BOINCers answer NO, so they have to install the 32-bit compat libs manually later on.

FalconFly, go to http://boinc.ssl.berkeley.edu/wiki/Installing_on_Linux and scroll to the bottom of the page to the section titled "64-bit Considerations", where you will find the command for installing the 32-bit compatibility libs. If you already have the libs installed, then yum will see that and tell you so. If they are not installed, then yum will install them. That is far easier for you to try than for the admins to scan the list of hosts for 64-bit Fedora systems. And even if they agreed to do that, what would it tell you other than that those hosts, if they're crunching, have the 32-bit compatibility libs installed?
old_user3434
Joined: 30 Aug 04
Posts: 77
Credit: 1,785,934
RAC: 0
Message 35852 - Posted: 6 Jan 2009, 19:29:13 UTC

Interesting. The normal behaviour of Fedora 8 and 9 x86_64 made me assume Fedora 10 would behave identically, and since my other Projects (POEM, MalariaControl, SETI, Einstein, Rosetta) don't require any 32bit compatibility, I did not expect anything different.

Thanks for the hints. I've installed the compat libs and will now have to wait and see how that works (the Host will attempt another Model download in ~5 hrs).
Scientific Network : 44800 MHz - 77824 MB - 1970 GB
old_user3434
Joined: 30 Aug 04
Posts: 77
Credit: 1,785,934
RAC: 0
Message 35857 - Posted: 7 Jan 2009, 0:39:03 UTC - in response to Message 35852.  

Good news, the Client is now finally able to launch.

Thanks for the quick help :)
Scientific Network : 44800 MHz - 77824 MB - 1970 GB
old_user170894
Joined: 3 Mar 06
Posts: 96
Credit: 353,185
RAC: 0
Message 35870 - Posted: 8 Jan 2009, 13:58:58 UTC - in response to Message 35857.  

You're welcome :)
