boinc runs rosetta fine although climateprediction crashes at startup?

old_user189455

Joined: 6 Jun 06
Posts: 1
Credit: 154,587
RAC: 0
Message 23032 - Posted: 6 Jun 2006, 11:54:45 UTC
Last modified: 6 Jun 2006, 11:57:33 UTC

Logs are included below. The story: I have multiple Xeon servers running Linux, all running BOINC and previously 100% Rosetta. I wanted to run some climateprediction work as well, but I get what appears to be the same crash on every server.

I run Trustix 2.2 as the distribution (designed for minimal-install servers), with kernel 2.6 and glibc 2.3. All the essentials are there, but no X Windows, as it's not needed and I don't want BOINC graphics.

ldd on the climate binary shows all required libraries are present:
ldd hadcm3trans_5.08_i686-pc-linux-gnu
libpthread.so.0 => /lib/libpthread.so.0 (0x40019000)
libc.so.6 => /lib/libc.so.6 (0x4006b000)
libdl.so.2 => /lib/libdl.so.2 (0x4019b000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
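
A minimal sketch of automating the same check across several servers, assuming the output of ldd has been captured as text. The helper name missing_libraries is hypothetical and not part of BOINC or climateprediction.net. Note that ldd only reports whether each shared library can be located; it does not verify that individual symbols resolve, so a clean result here does not rule out an "undefined symbol" message at run time.

```python
def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'.

    Hypothetical helper for illustration: it only looks at lines of the
    form 'name => target' and flags those whose target is 'not found'.
    """
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.partition("=>")
        if sep and target.strip() == "not found":
            missing.append(name.strip())
    return missing


# The ldd output quoted in the post above.
sample = """\
libpthread.so.0 => /lib/libpthread.so.0 (0x40019000)
libc.so.6 => /lib/libc.so.6 (0x4006b000)
libdl.so.2 => /lib/libdl.so.2 (0x4019b000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
"""

print(missing_libraries(sample))
```

For the output quoted in the post, every library has a resolved path, so nothing is flagged.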

Logs as follows:

creating: hadcm3lbm_avh7_05276622/dataout/
Archive: hadcm3lbm_avh7_05276622.zip
Unzipping workunit data file...
inflating: hadcm3lbm_avh7_05276622/jobs/yafbg.namelists
inflating: hadcm3lbm_avh7_05276622/jobs/ncatts.cpdc
Created shared memory region key = 87855 of size 655036 bytes
Reconstructing ancillary file DMSNH3SO2A1B_19182082N, this may take a minute!
Reconstructing ancillary file SULPC_OXIDANTS_19_A2_1990, this may take a minute!
Reconstructing ancillary file ozone_hadcm3_1919_2082, this may take a minute!
Can't setup pointer to .so shared memory hadcm3trans_5.08_i686-pc-linux-gnu: undefined symbol: setupSharedMem!
Can't setup pointer to .so graphics cleanup hadcm3trans_5.08_i686-pc-linux-gnu: undefined symbol: graphics_thread_cleanup
Copying files for startup...
Copying climate.cpdc files...
Starting model ID hadcm3lbm_avh7_05276622 Phase 1
Climate model starting - use graphics to monitor progress.
Or visit the website to see the graphs for this run.
Preparing for restart...
Rewinding a model-day...
Cleaning up graphics data...
Detaching shared memory...
2006-06-06 18:22:16 [---] Rescheduling CPU: application exited
2006-06-06 18:22:16 [climateprediction.net] Computation for task hadcm3lbm_avh7_05276622_1 finished
2006-06-06 18:22:16 [rosetta@home] Resuming task FRA_t311_CASP7_hom001_2_1b0nA_IGNORE_THE_REST_244_663_3_0 using rosetta version 516
2006-06-06 18:22:17 [climateprediction.net] Unrecoverable error for result hadcm3lbm_avh7_05276622_1 (file_xfer_error
file_name hadcm3lbm_avh7_05276622_1_1.zip
error code -161
file_xfer_error)

Same error continues with various filenames:
hadcm3lbm_avh7_05276622_1_2.zip
hadcm3lbm_avh7_05276622_1_3.zip
hadcm3lbm_avh7_05276622_1_4.zip
hadcm3lbm_avh7_05276622_1_5.zip
hadcm3lbm_avh7_05276622_1_6.zip
hadcm3lbm_avh7_05276622_1_7.zip
hadcm3lbm_avh7_05276622_1_8.zip

2006-06-06 18:22:17 [climateprediction.net] Deferring scheduler requests for 1 minutes and 0 seconds
2006-06-06 18:23:20 [climateprediction.net] Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2006-06-06 18:23:20 [climateprediction.net] Reason: To report completed tasks
2006-06-06 18:23:20 [climateprediction.net] Requesting 172800 seconds of new work, and reporting 1 completed tasks
2006-06-06 18:23:26 [climateprediction.net] Scheduler request succeeded
2006-06-06 18:23:26 [climateprediction.net] Message from server: No work sent
2006-06-06 18:23:26 [climateprediction.net] Message from server: (reached daily quota of 2 results)
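
The two "undefined symbol" lines in the log have the shape of the message the dynamic loader returns when a symbol is looked up by name at run time and is not found. The log alone does not establish whether those lookups are related to the crash. As a rough illustration of that kind of lookup, here is a sketch using Python's ctypes on Linux. The helper has_symbol is hypothetical, and probing the running Python process (rather than the climate binary) is an assumption made only to keep the example self-contained.

```python
import ctypes


def has_symbol(library, name):
    """Return True if `name` resolves in the loaded library, else False.

    ctypes raises AttributeError when the underlying symbol lookup fails.
    """
    try:
        getattr(library, name)
    except AttributeError:
        return False
    return True


# On Linux, CDLL(None) opens a handle to the running process itself.
process = ctypes.CDLL(None)

# printf is a standard C library function; the other two names are the
# symbols reported as undefined in the log above.
for symbol in ("printf", "setupSharedMem", "graphics_thread_cleanup"):
    print(symbol, has_symbol(process, symbol))
```

To probe a specific shared object instead, pass its path to ctypes.CDLL. That only works for files the loader can open as shared libraries.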

astroWX
Volunteer moderator

Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 23083 - Posted: 9 Jun 2006, 22:06:13 UTC

old_user189455 wrote: "Run trustix 2.2 as the distribution"

Can't give you a definitive answer despite having run Linux for years. However, there are distros that simply won't work with CPDN because of library conflicts.

Don't recall having seen Trustix, so that might be it. That conjecture seems reinforced by the across-the-board nature of your experience.

The -161 error code has to do with trying to upload 'empty' files and masks the real error.
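
One way to see which of the result files named in the log are absent or zero-length is a small check like the following. This is a sketch under assumptions: the helper classify_upload_files is hypothetical, and the directory holding the files depends on the BOINC installation, so the example runs against a throwaway directory instead.

```python
import os
import tempfile


def classify_upload_files(directory, names):
    """Map each expected file name to 'missing', 'empty', or 'ok'."""
    status = {}
    for name in names:
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            status[name] = "missing"
        elif os.path.getsize(path) == 0:
            status[name] = "empty"
        else:
            status[name] = "ok"
    return status


# File names follow the pattern seen in the log (..._1_1.zip to ..._1_8.zip).
expected = ["hadcm3lbm_avh7_05276622_1_%d.zip" % i for i in range(1, 9)]

# Demonstration only: replace the temporary directory with the real
# project directory (its location is an assumption, not given in the thread).
with tempfile.TemporaryDirectory() as directory:
    open(os.path.join(directory, expected[0]), "w").close()  # zero-length file
    for name, state in classify_upload_files(directory, expected).items():
        print(name, state)
```

This only shows what state the upload files are in. The underlying reason the model stopped still has to come from the application's own output.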

'No work sent' is a protective limit on the server side (the daily quota shown in your log) that keeps problem systems from draining the supply of Work Units.

There are some Linux gurus around and I hope one passes by. (I posted only because your post is days old and deserves some attention.)

Hope you get it sorted because we'd like to have your horsepower added to the stable.

Best of luck and Welcome aboard!
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
