Sulphur crash after project switch

Desti

Joined: 6 Aug 04
Posts: 124
Credit: 9,195,838
RAC: 0
Message 17465 - Posted: 27 Nov 2005, 0:37:23 UTC
Last modified: 27 Nov 2005, 0:37:42 UTC

Hello

I have a problem with one of my sulphur models: it sometimes crashes when BOINC switches to another project.
I am running BOINC 4.72. Does BOINC 5 fix this?



sulphur_47q7_100296911 - PH 2 TS 0102923 A - 15/11/1831 05:30 - H:M:S=0400:24:08 AVG= 3.98 DLT= 2.96
2005-11-11 01:24:27 [Predictor @ Home] Sending scheduler request to http://predictor.scripps.edu/predictor_cgi/cgi
2005-11-11 01:24:27 [Predictor @ Home] Reason: To fetch work
2005-11-11 01:24:27 [Predictor @ Home] Requesting 8640 seconds of work, returning 0 results
sulphur_47q7_100296911 - PH 2 TS 0102924 A - 15/11/1831 06:00 - H:M:S=0400:24:10 AVG= 3.98 DLT= 1.93
2005-11-11 01:24:30 [Predictor @ Home] Scheduler request to http://predictor.scripps.edu/predictor_cgi/cgi succeeded
2005-11-11 01:24:31 [Predictor @ Home] Started download of bprion_1_91868.ini
2005-11-11 01:24:31 [Predictor @ Home] Started download of bprion_1_91868.inp
sulphur_47q7_100296911 - PH 2 TS 0102925 A - 15/11/1831 06:30 - H:M:S=0400:24:13 AVG= 3.98 DLT= 2.92
2005-11-11 01:24:32 [Predictor @ Home] Finished download of bprion_1_91868.ini
2005-11-11 01:24:32 [Predictor @ Home] Throughput 2367 bytes/sec
2005-11-11 01:24:32 [Predictor @ Home] Started download of bprion_1_91868.seq
2005-11-11 01:24:33 [Predictor @ Home] Finished download of bprion_1_91868.inp
2005-11-11 01:24:33 [Predictor @ Home] Throughput 107 bytes/sec
2005-11-11 01:24:33 [Predictor @ Home] Finished download of bprion_1_91868.seq
2005-11-11 01:24:33 [Predictor @ Home] Throughput 2339 bytes/sec
2005-11-11 01:24:33 [Predictor @ Home] Started download of bprion_1_91868.res
2005-11-11 01:24:33 [Predictor @ Home] Started download of bprion_1_92128.ini
2005-11-11 01:24:34 [Predictor @ Home] Finished download of bprion_1_91868.res
2005-11-11 01:24:34 [Predictor @ Home] Throughput 3 bytes/sec
2005-11-11 01:24:34 [Predictor @ Home] Started download of bprion_1_92128.inp
2005-11-11 01:24:34 [---] request_reschedule_cpus: files downloaded
2005-11-11 01:24:34 [climateprediction.net] Pausing result sulphur_47q7_100296911_0 (removed from memory)
2005-11-11 01:24:34 [Predictor @ Home] Starting result bprion_1_91868_3 using mfoldB125 version 4.29
2005-11-11 01:24:35 [Predictor @ Home] Finished download of bprion_1_92128.ini
2005-11-11 01:24:35 [Predictor @ Home] Throughput 1264 bytes/sec
2005-11-11 01:24:35 [Predictor @ Home] Finished download of bprion_1_92128.inp
2005-11-11 01:24:35 [Predictor @ Home] Throughput 158 bytes/sec
2005-11-11 01:24:35 [Predictor @ Home] Started download of bprion_1_92128.seq
CPDN Monitor got quit request...
Cleaning up graphics data...
Closing graphics shared object file...
Detaching shared memory...
2005-11-11 01:24:35 [Predictor @ Home] Started download of bprion_1_92128.res
2005-11-11 01:24:36 [climateprediction.net] Unrecoverable error for result sulphur_47q7_100296911_0 (process got signal 11)
2005-11-11 01:24:36 [climateprediction.net] Unrecoverable error for result sulphur_47q7_100296911_0 (process got signal 11)
2005-11-11 01:24:36 [---] request_reschedule_cpus: process exited
2005-11-11 01:24:36 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2005-11-11 01:24:36 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2005-11-11 01:24:36 [Predictor @ Home] Finished download of bprion_1_92128.seq
2005-11-11 01:24:36 [Predictor @ Home] Throughput 2688 bytes/sec
2005-11-11 01:24:36 [climateprediction.net] Computation for result sulphur_47q7_100296911_0 finished
2005-11-11 01:24:37 [Predictor @ Home] Finished download of bprion_1_92128.res
2005-11-11 01:24:37 [Predictor @ Home] Throughput 2 bytes/sec
2005-11-11 01:24:37 [---] request_reschedule_cpus: files downloaded
2005-11-11 02:20:22 [---] request_reschedule_cpus: process exited
2005-11-11 02:20:22 [Predictor @ Home] Computation for result bprion_1_91868_3 finished
2005-11-11 02:20:22 [LHC@home] Restarting result wnov1C_v6s4hvnom_mqx__2__64.291_59.311__6_8__6__70_1_sixvf_boinc89374_0 using sixtrack version 4.66
2005-11-11 02:20:24 [Predictor @ Home] Started upload of bprion_1_91868_3_0
2005-11-11 02:20:25 [Predictor @ Home] Started upload of bprion_1_91868_3_1
2005-11-11 02:20:28 [Predictor @ Home] Finished upload of bprion_1_91868_3_0
2005-11-11 02:20:28 [Predictor @ Home] Throughput 4400 bytes/sec
2005-11-11 02:20:29 [Predictor @ Home] Started upload of bprion_1_91868_3_2
2005-11-11 02:20:34 [Predictor @ Home] Finished upload of bprion_1_91868_3_1
2005-11-11 02:20:34 [Predictor @ Home] Throughput 15155 bytes/sec
2005-11-11 02:20:40 [Predictor @ Home] Finished upload of bprion_1_91868_3_2
2005-11-11 02:20:40 [Predictor @ Home] Throughput 2669 bytes/sec
2005-11-11 02:28:35 [LHC@home] Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
2005-11-11 02:28:35 [LHC@home] Reason: To report results
2005-11-11 02:28:35 [LHC@home] Requesting 0 seconds of work, returning 1 results
2005-11-11 02:28:37 [LHC@home] Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
2005-11-11 02:28:38 [LHC@home] Deferring communication with project for 5 seconds
2005-11-11 02:28:38 [LHC@home] Deferring communication with project for 5 seconds
2005-11-11 03:48:36 [climateprediction.net] Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-11-11 03:48:36 [climateprediction.net] Reason: To report results
2005-11-11 03:48:36 [climateprediction.net] Requesting 0 seconds of work, returning 1 results
2005-11-11 03:48:39 [climateprediction.net] Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2005-11-11 04:18:58 [LHC@home] Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi



sulphur_47q7_100296911 - PH 3 TS 0075402 A - 11/04/1845 21:00 - H:M:S=0658:32:38 AVG= 3.99 DLT= 1.97
sulphur_47q7_100296911 - PH 3 TS 0075403 A - 11/04/1845 21:30 - H:M:S=0658:32:41 AVG= 3.99 DLT= 2.97
sulphur_47q7_100296911 - PH 3 TS 0075404 A - 11/04/1845 22:00 - H:M:S=0658:32:52 AVG= 3.99 DLT=10.99
sulphur_47q7_100296911 - PH 3 TS 0075405 A - 11/04/1845 22:30 - H:M:S=0658:32:55 AVG= 3.99 DLT= 2.97
sulphur_47q7_100296911 - PH 3 TS 0075406 A - 11/04/1845 23:00 - H:M:S=0658:32:57 AVG= 3.99 DLT= 1.97
sulphur_47q7_100296911 - PH 3 TS 0075407 A - 11/04/1845 23:30 - H:M:S=0658:33:00 AVG= 3.99 DLT= 2.97
sulphur_47q7_100296911 - PH 3 TS 0075408 A - 12/04/1845 00:00 - H:M:S=0658:33:01 AVG= 3.99 DLT= 1.97
sulphur_47q7_100296911 - PH 3 TS 0075409 A - 12/04/1845 00:30 - H:M:S=0658:33:04 AVG= 3.99 DLT= 2.97
2005-11-26 01:12:04 [LHC@home] Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
2005-11-26 01:12:04 [LHC@home] Reason: To fetch work
2005-11-26 01:12:04 [LHC@home] Requesting 8640 seconds of work, returning 0 results
2005-11-26 01:12:05 [LHC@home] Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
2005-11-26 01:12:06 [LHC@home] Started download of wjun4nmC_v6s4hhpac_mqx-nm__5__64.302_59.312__4_6__6__10_1_sixvf_boinc63923.zip
2005-11-26 01:12:07 [LHC@home] Finished download of wjun4nmC_v6s4hhpac_mqx-nm__5__64.302_59.312__4_6__6__10_1_sixvf_boinc63923.zip
2005-11-26 01:12:07 [LHC@home] Throughput 32581 bytes/sec
2005-11-26 01:12:07 [---] request_reschedule_cpus: files downloaded
2005-11-26 01:12:07 [climateprediction.net] Pausing result sulphur_47q7_100296911_0 (removed from memory)
2005-11-26 01:12:07 [LHC@home] Starting result wjun4nmC_v6s4hhpac_mqx-nm__5__64.302_59.312__4_6__6__10_1_sixvf_boinc63923_0 using sixtrack version 4.66
CPDN Monitor got quit request...
Cleaning up graphics data...
Closing graphics shared object file...
Detaching shared memory...
2005-11-26 01:12:09 [climateprediction.net] Unrecoverable error for result sulphur_47q7_100296911_0 (process got signal 11)
2005-11-26 01:12:09 [climateprediction.net] Unrecoverable error for result sulphur_47q7_100296911_0 (process got signal 11)
2005-11-26 01:12:09 [---] request_reschedule_cpus: process exited
2005-11-26 01:12:09 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2005-11-26 01:12:09 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2005-11-26 01:12:09 [climateprediction.net] Computation for result sulphur_47q7_100296911_0 finished




sulphur_47q7_100296911 - PH 3 TS 0059267 A - 05/05/1844 17:30 - H:M:S=0640:36:00 AVG= 3.99 DLT= 1.97
sulphur_47q7_100296911 - PH 3 TS 0059268 A - 05/05/1844 18:00 - H:M:S=0640:36:03 AVG= 3.99 DLT= 2.97
sulphur_47q7_100296911 - PH 3 TS 0059269 A - 05/05/1844 18:30 - H:M:S=0640:36:05 AVG= 3.99 DLT= 1.97
2005-11-26 15:23:52 [climateprediction.net] Pausing result sulphur_47q7_100296911_0 (removed from memory)
2005-11-26 15:23:52 [LHC@home] Restarting result wjun4nmC_v6s4hhpac_mqx-nm__18__64.293_59.303__8_10__6__75_1_sixvf_boinc81922_0 using sixtrack version 4.66
CPDN Monitor got quit request...
Cleaning up graphics data...
Closing graphics shared object file...
2005-11-26 15:23:55 [climateprediction.net] Unrecoverable error for result sulphur_47q7_100296911_0 (process got signal 11)
2005-11-26 15:23:55 [climateprediction.net] Unrecoverable error for result sulphur_47q7_100296911_0 (process got signal 11)
2005-11-26 15:23:55 [---] request_reschedule_cpus: process exited
2005-11-26 15:23:55 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2005-11-26 15:23:55 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2005-11-26 15:23:55 [climateprediction.net] Computation for result sulphur_47q7_100296911_0 finished
2005-11-26 15:59:02 [---] request_reschedule_cpus: process exited
2005-11-26 15:59:02 [LHC@home] Computation for result wjun4nmC_v6s4hhpac_mqx-nm__18__64.293_59.303__8_10__6__75_1_sixvf_boinc81922_0 finished
2005-11-26 15:59:02 [LHC@home] Starting result wjun4nmC_v6s4hhpac_mqx-nm__10__64.3_59.31__6_8__6__80_1_sixvf_boinc70958_4 using sixtrack version 4.66
2005-11-26 15:59:03 [LHC@home] Started upload of wjun4nmC_v6s4hhpac_mqx-nm__18__64.293_59.303__8_10__6__75_1_sixvf_boinc81922_0_0
2005-11-26 15:59:07 [LHC@home] Finished upload of wjun4nmC_v6s4hhpac_mqx-nm__18__64.293_59.303__8_10__6__75_1_sixvf_boinc81922_0_0
2005-11-26 15:59:07 [LHC@home] Throughput 34931 bytes/sec
2005-11-26 17:47:55 [climateprediction.net] Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-11-26 17:47:55 [climateprediction.net] Reason: To report results
2005-11-26 17:47:55 [climateprediction.net] Requesting 0 seconds of work, returning 1 results
2005-11-26 17:47:56 [climateprediction.net] Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2005-11-26 17:48:14 [LHC@home] Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi

Linux Users Everywhere @ BOINC
ID: 17465
old_user1
Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 17466 - Posted: 27 Nov 2005, 0:48:32 UTC

version 5 does seem to be a bit better about suspending/resuming CPDN.
ID: 17466
old_user2354

Joined: 28 Aug 04
Posts: 13
Credit: 767,708
RAC: 0
Message 17707 - Posted: 4 Dec 2005, 17:41:17 UTC - in response to Message 17466.  

version 5 does seem to be a bit better about suspending/resuming CPDN.

I can't completely agree with this. Both of my Linux boxes (SUSE 10.0 and Debian stable) run 5.2.13, and they still crash sulphur WUs pretty quickly. In fact, I haven't had a single sulphur WU reach a trickle yet. It seems to me that there is some sort of bug in the sulphur client that makes it crash when the model is paused. That makes CPDN quite hard to use on Linux :-(
ID: 17707
haddock29

Joined: 13 Sep 04
Posts: 4
Credit: 2,286,393
RAC: 0
Message 18319 - Posted: 17 Dec 2005, 21:06:38 UTC

I got that signal 11 error after CPDN was paused and removed from memory. It happened twice when BOINC paused in order to run a CPU benchmark, and I thought that the problem was the benchmark. It could well be the pause plus the removal from memory. I have another computer which has switched very easily between SETI and CPDN (for at least one year...), but it leaves tasks in memory when switching.
The "good" computer runs Fedora and BOINC 4.19. It is a dual Athlon.
The "bad" one runs BOINC 5.1.2 and Red Hat Enterprise. It is a dual Xeon with HT (4 procs).
So it is clearly not a problem of BOINC version 4 versus 5.

I think you should try activating "leave in memory" in your preferences; that may solve your problem.
And I still have to understand why, with the same BOINC preferences (and "leave in memory" set to yes), one of my computers removes tasks from memory...
And welcome to the "signal 11" club.
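
To check whether the signal 11 crashes really follow a pause, the client's log can be scanned for a "Pausing result ..." line followed by an "Unrecoverable error ... (process got signal N)" line for the same result. The following is only a rough sketch: the regular expressions are modelled on the log lines quoted in this thread, the function name is made up for illustration, and other BOINC versions may word these messages differently.

```python
import re

# Patterns modelled on the log lines quoted in this thread (an assumption;
# other client versions may format these messages differently).
PAUSE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[[^\]]+\] "
    r"Pausing result (?P<result>\S+) "
    r"\((?P<mode>removed from memory|left in memory)\)"
)
ERROR_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[[^\]]+\] "
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\(process got signal (?P<signal>\d+)\)"
)


def find_crashes_after_pause(lines):
    """Return one record per result that received a signal after a pause."""
    last_pause = {}  # result name -> (timestamp, mode) of the latest pause
    crashes = []
    for line in lines:
        pause = PAUSE_RE.match(line)
        if pause:
            last_pause[pause["result"]] = (pause["ts"], pause["mode"])
            continue
        error = ERROR_RE.match(line)
        if not error:
            continue
        # pop() also swallows the duplicated error line the client prints.
        paused = last_pause.pop(error["result"], None)
        if paused is None:
            continue
        crashes.append({
            "result": error["result"],
            "paused_at": paused[0],
            "mode": paused[1],
            "crashed_at": error["ts"],
            "signal": int(error["signal"]),
        })
    return crashes


sample = """\
2005-11-26 01:12:07 [climateprediction.net] Pausing result sulphur_47q7_100296911_0 (removed from memory)
CPDN Monitor got quit request...
2005-11-26 01:12:09 [climateprediction.net] Unrecoverable error for result sulphur_47q7_100296911_0 (process got signal 11)
2005-11-26 01:12:09 [climateprediction.net] Unrecoverable error for result sulphur_47q7_100296911_0 (process got signal 11)
""".splitlines()

crashes = find_crashes_after_pause(sample)
for crash in crashes:
    print(crash)
```

On Linux, signal 11 is SIGSEGV (a segmentation fault). If every record shows "removed from memory" as the mode, that would support the idea that the crash is tied to unloading the model rather than to the BOINC version.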
ID: 18319
old_user8677

Joined: 2 Sep 04
Posts: 2
Credit: 94,653
RAC: 0
Message 18326 - Posted: 18 Dec 2005, 5:41:02 UTC - in response to Message 17466.  

For me, sulphur fails to reload after a project switch 100% of the time.
SETI@home and World Community Grid both switch perfectly. I think I'll stop trying to run CPDN on Linux.

Dual P3 with 1.75G RAM
Linux 2.4.21
BOINC client version 5.2.13 for i686-pc-linux-gnu

The following is typical. CPDN downloads, the scheduler preempts one of the other projects, and CPDN starts. The scheduler then preempts CPDN. Later the scheduler preempts another project in favour of CPDN, which fails to resume. CPDN decides the work unit is complete and then fails trying to upload the result.

2005-12-15 21:46:04 [climateprediction.net] Started download of sulphur_gd1k_000763400.zip
2005-12-15 21:46:07 [climateprediction.net] Finished download of sulphur_gd1k_000763400.zip
2005-12-15 21:46:07 [climateprediction.net] Throughput 9995 bytes/sec
2005-12-15 21:46:09 [---] request_reschedule_cpus: files downloaded
2005-12-15 21:46:09 [World Community Grid] Pausing result eb408_2E_2 (left in memory)
2005-12-15 21:46:09 [climateprediction.net] Starting result sulphur_gd1k_000763400_0 using sulphur_cycle version 422
Archive: sulphur_data_4.22_i686-pc-linux-gnu.zip
Starting model in /var/BOINC/ajhood/projects/ClimatePrediction.net...
creating: sulphur_gd1k_000763400/datain/
inflating: sulphur_gd1k_000763400/datain/ppcodes
creating: sulphur_gd1k_000763400/datain/dumps/
inflating: sulphur_gd1k_000763400/datain/dumps/slab32_1810.start
inflating: sulphur_gd1k_000763400/datain/lats
creating: sulphur_gd1k_000763400/datain/ancil/
inflating: sulphur_gd1k_000763400/datain/ancil/qrclim.uvcurr.32
inflating: sulphur_gd1k_000763400/datain/ancil/qrclim.ozone_preind_corr
inflating: sulphur_gd1k_000763400/datain/ancil/SULPC_OXIDANTS_19_A2_1990
inflating: sulphur_gd1k_000763400/datain/ancil/2050_DMSW_MATC
inflating: sulphur_gd1k_000763400/datain/ancil/DMSW_NH3_SO21985
inflating: sulphur_gd1k_000763400/datain/ancil/qrclim.newsst5.32
creating: sulphur_gd1k_000763400/datain/ancil/ctldata/
creating: sulphur_gd1k_000763400/datain/ancil/ctldata/STASHmaster/
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/STASHmaster/STASHmaster_A
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/STASHmaster/STASHmaster_S
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/STASHmaster/STASHmaster_O
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/STASHmaster/STASHmaster_W
creating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01002207
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01005223
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003278
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01005208
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003274
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003286
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003276
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003275
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003280
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01001218
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01010206
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01005207
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003281
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003236
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003237
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003279
extracting: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003255
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01005222
extracting: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003254
inflating: sulphur_gd1k_000763400/datain/ancil/ctldata/stasets/X01003277
inflating: sulphur_gd1k_000763400/datain/ancil/NAT_VOLC
inflating: sulphur_gd1k_000763400/datain/ancil/qrclim.icedp.32
creating: sulphur_gd1k_000763400/dataout/
inflating: sulphur_gd1k_000763400/registration_license.txt
creating: sulphur_gd1k_000763400/jobs/
inflating: sulphur_gd1k_000763400/jobs/spec3a_lw_3_asol2c_hadcm3
extracting: sulphur_gd1k_000763400/jobs/yabsd.PRESM_S
inflating: sulphur_gd1k_000763400/jobs/spec3a_sw_3_asol2b_hadcm3
inflating: sulphur_gd1k_000763400/jobs/cont.stashc
inflating: sulphur_gd1k_000763400/jobs/cont.so2.stashc
inflating: sulphur_gd1k_000763400/jobs/recona.12
inflating: sulphur_gd1k_000763400/jobs/doub.so2.stashc
inflating: sulphur_gd1k_000763400/jobs/yabsd.stashc
inflating: sulphur_gd1k_000763400/jobs/recona.15
inflating: sulphur_gd1k_000763400/jobs/spin.stashc
inflating: sulphur_gd1k_000763400/jobs/recona.14
inflating: sulphur_gd1k_000763400/jobs/recona.13
inflating: sulphur_gd1k_000763400/jobs/yabsd.ihist
inflating: sulphur_gd1k_000763400/jobs/yabsd.PRESM_A
inflating: sulphur_gd1k_000763400/jobs/doub.stashc
creating: sulphur_gd1k_000763400/tmp/
Archive: sulphur_gd1k_000763400.zip
inflating: sulphur_gd1k_000763400/jobs/climate.spin
inflating: sulphur_gd1k_000763400/jobs/climate.cont
inflating: sulphur_gd1k_000763400/jobs/climate.doub
inflating: sulphur_gd1k_000763400/jobs/climate.so2.cont
inflating: sulphur_gd1k_000763400/jobs/climate.so2.doub
inflating: sulphur_gd1k_000763400/jobs/ncatts.cpdc
Created shared memory region key = 76190 of size 569976 bytes
.so shmem return code = 0
Copying files for startup...
In pre_initialise_phase (part 1 of 3)
In initialise_phase (part 2 of 3)
In startup_phase (part 3 of 3)
Starting model ID sulphur_gd1k_000763400 Phase 1
Waiting for model startup, this may take a minute...
2005-12-15 21:57:55 [---] request_reschedule_cpus: process exited
2005-12-15 21:57:55 [SETI@home] Computation for result 22ap04ab.11574.11282.742326.103_4 finished
2005-12-15 21:57:55 [SETI@home] Resuming result 17mr05ab.17159.29378.729828.243_6 using setiathome version 402
2005-12-15 21:57:55 [climateprediction.net] Pausing result sulphur_gd1k_000763400_0 (left in memory)
2005-12-15 21:57:55 [SETI@home] Starting result 27se04aa.18706.7488.934640.62_0 using setiathome version 402
2005-12-15 21:57:58 [SETI@home] Started upload of 22ap04ab.11574.11282.742326.103_4_0
2005-12-15 21:58:09 [SETI@home] Finished upload of 22ap04ab.11574.11282.742326.103_4_0
2005-12-15 21:58:09 [SETI@home] Throughput 4819 bytes/sec
2005-12-15 22:57:55 [World Community Grid] Resuming result eb408_2E_2 using rosetta version 421
2005-12-15 22:57:55 [SETI@home] Pausing result 17mr05ab.17159.29378.729828.243_6 (left in memory)
2005-12-15 22:57:55 [climateprediction.net] Resuming result sulphur_gd1k_000763400_0 using sulphur_cycle version 422
2005-12-15 22:57:55 [SETI@home] Pausing result 27se04aa.18706.7488.934640.62_0 (left in memory)
Resuming CPDN!
Model timeout at 180.00 seconds
Preparing for restart...
Rewinding a model-day...
Starting model ID sulphur_gd1k_000763400 Phase 1
Waiting for model startup, this may take a minute...
2005-12-15 23:04:33 [---] request_reschedule_cpus: process exited
2005-12-15 23:04:33 [World Community Grid] Computation for result eb408_2E_2 finished
2005-12-15 23:04:33 [World Community Grid] Resuming result eb408_06_0 using rosetta version 421
2005-12-15 23:04:36 [World Community Grid] Started upload of eb408_2E_2_0
2005-12-15 23:06:46 [World Community Grid] Finished upload of eb408_2E_2_0
2005-12-15 23:06:46 [World Community Grid] Throughput 11838 bytes/sec
Model timeout at 180.00 seconds
Preparing for restart...
Rewinding a model-month...
Error: Restart files for dataout/restart.month not found
Giving up, this result exceeded crash count for available restart files.
deflating : restart.day
deflating : yabsd.out
2005-12-15 23:10:18 [---] request_reschedule_cpus: process exited
2005-12-15 23:10:18 [climateprediction.net] Computation for result sulphur_gd1k_000763400_0 finished
2005-12-15 23:10:18 [World Community Grid] Starting result eb408_30_2 using rosetta version 421
2005-12-15 23:10:18 [climateprediction.net] Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2005-12-15 23:10:18 [climateprediction.net] Reason: To fetch work
2005-12-15 23:10:18 [climateprediction.net] Requesting 86400 seconds of new work
2005-12-15 23:10:19 [climateprediction.net] Unrecoverable error for result sulphur_gd1k_000763400_0 (<file_xfer_error>
<file_name>sulphur_gd1k_000763400_0_1.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_gd1k_000763400_0_2.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_gd1k_000763400_0_3.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_gd1k_000763400_0_4.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_gd1k_000763400_0_5.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
)
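
The restart messages in the log above (a timeout, "Rewinding a model-day...", another timeout, "Rewinding a model-month...", then "Giving up") look like a fallback through progressively older restart files. The sketch below only illustrates that reading of the messages. It is not the CPDN code: the function name, the level list, and the counting rule are all assumptions; only the file names `restart.day` and `restart.month` come from the log.

```python
import tempfile
from pathlib import Path

# Hypothetical rewind order, inferred only from the log messages
# ("Rewinding a model-day...", then "Rewinding a model-month...").
RESTART_LEVELS = ("restart.day", "restart.month")


def pick_restart_file(dataout_dir, timeout_count):
    """Return the restart file to rewind to after the Nth timeout, or None."""
    if not 1 <= timeout_count <= len(RESTART_LEVELS):
        return None  # more timeouts than restart levels: give up
    candidate = Path(dataout_dir) / RESTART_LEVELS[timeout_count - 1]
    # A missing file corresponds to "Restart files ... not found" in the log.
    return candidate if candidate.is_file() else None


# Reproduce the situation in the log: only a daily restart file exists.
with tempfile.TemporaryDirectory() as dataout:
    (Path(dataout) / "restart.day").write_text("placeholder")
    first = pick_restart_file(dataout, 1)
    second = pick_restart_file(dataout, 2)

print(first.name if first else "give up")
print(second.name if second else "give up")
```

Under this reading, a model that times out twice before it has written its first monthly restart file has nothing left to rewind to, which would explain why a freshly started work unit is declared finished after a single failed resume.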

ID: 18326
Desti

Joined: 6 Aug 04
Posts: 124
Credit: 9,195,838
RAC: 0
Message 18387 - Posted: 18 Dec 2005, 23:18:41 UTC - in response to Message 18326.  

Have you checked your RAM?
Does this also happen when you don't leave it in memory?
Linux Users Everywhere @ BOINC
ID: 18387
old_user8677

Joined: 2 Sep 04
Posts: 2
Credit: 94,653
RAC: 0
Message 18407 - Posted: 19 Dec 2005, 11:25:15 UTC - in response to Message 18387.  

Have you checked your RAM?
Does this also happen when you don't leave it in memory?


I've never had RAM errors. For instance, GIMP runs quite happily with images over 1 GB (without paging), and gcc both compiles (a good test of RAM) and compiles some sources that require over 300 MB.

As far as I can tell this is exactly the same behaviour as when I had leave in memory turned off.
ID: 18407
