Posts by old_user6988

1) Questions and Answers : Macintosh : Model crashes as it starts up (Message 3503)
Posted 8 Sep 2004 by old_user6988
Post:
I too have gotten some working models.....

I only detached one or two machines. I can't run around to all the machines and do that, so I will let the rest try to pick up work when they can.

The only problem I see now is when I go to the results page to see who has which units. The host always comes up as hidden, so I can't check which machines have already picked up new work.

drweaser

> > This is happening to many folks.
> > No one seems to know what the error number means or have any solution.
> > You can detach and reattach to the project every time your daily quota of
> > models is exceeded to keep trying new models, and hope to get a working model
> > some time.
> >
> > Boinc is fixed and the model is running. Thanks!
>
>
2) Questions and Answers : Macintosh : Cannot get a model to run (Message 3180)
Posted 6 Sep 2004 by old_user6988
Post:
Got a different error this time.....looks like some sort of memory problem.....maybe this will help somebody figure it out....

Starting model in /Applications/boinc/projects/climateprediction.net...
Archive: hadsm3se_4.03_powerpc-apple-darwin.zip
inflating: ./hadsm3se_4.03_powerpc-apple-darwin
inflating: ./libxlf90.A.dylib
...
inflating: 1sdl_100104286/jobs/climate.doub
inflating: 1sdl_100104286/jobs/ncatts.cpdc
inflating: 1sdl_100104286/jobs/spec3a_sw_3_asol2b_hadcm3


Could not create shared memory region 25780, 301228



2004-09-06 15:41:44 [climateprediction.net] Unrecoverable error for result 1sdl_100104286_1 (process exited with code 255 (0xff))
2004-09-06 15:41:44 [climateprediction.net] Unrecoverable error for result 1sdl_100104286_1 (process exited with code 255 (0xff))
2004-09-06 15:41:44 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2004-09-06 15:41:44 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2004-09-06 15:41:44 [climateprediction.net] Computation for result 1sdl_100104286 finished
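For anyone sifting through logs like this from a lot of machines, here is a minimal sketch of pulling out the "Unrecoverable error" entries and their exit codes. The regular expression is inferred only from the lines quoted above, not from any official BOINC log format, and the name `find_unrecoverable_errors` is hypothetical. Repeated lines (the client prints each one twice here) are collapsed.

```python
import re

# Pattern inferred from the log lines quoted above; this is an assumption,
# not an official BOINC message format.
ERROR_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]+)\] "
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\(process exited with code (?P<code>\d+) \(0x[0-9a-fA-F]+\)\)"
)

def find_unrecoverable_errors(log_text):
    """Return unique (timestamp, result, exit_code) tuples in order of first appearance."""
    seen = set()
    errors = []
    for line in log_text.splitlines():
        match = ERROR_RE.match(line.strip())
        if not match:
            continue
        entry = (match.group("ts"), match.group("result"), int(match.group("code")))
        if entry not in seen:  # skip verbatim repeats of the same message
            seen.add(entry)
            errors.append(entry)
    return errors
```

Feeding it the saved message log from each machine would give one short list of failed results per host, which is easier to compare than the raw output.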
3) Questions and Answers : Macintosh : Cannot get a model to run (Message 3176)
Posted 6 Sep 2004 by old_user6988
Post:
So out of my 30 eMacs, 5 are currently running a model. Question 1: Should I reset those? They were the ones that had the "negative" computer.

Question 2. Should I reset the remaining 25 machines or simply let them keep trying to get a usable WU?

At least now the trickles show up under my account, thanks for the fix!

drweaser

> I think I've corrected a lot of the oddities such as the "negative" computer.
> The problem is there seems to be some "orphan" runs out there, i.e. a problem
> with the BOINC "feeder" (which has since been fixed) means that some runs were
> sent out with incorrect parent records, so there are a bunch of runs out there
> that will not trickle or update properly. If your machine is one, you're
> probably better off resetting and attaching so that you get a "proper" run.
>
>
>
4) Questions and Answers : Macintosh : Cannot get a model to run (Message 3153)
Posted 6 Sep 2004 by old_user6988
Post:
Now what the Hades is going on....

check this out

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=141230

now it can't figure out the host computer for this work unit.....something is bonkers somewhere.....

drweaser


> I have about 6 machines out of the 30 that are now working, judging from the
> returned results. So there is something in the WUs that ain't right. Hopefully
> they will find it soon so that the WUs don't constantly get downloaded and
> rejected and redownloaded. I wonder if they have taken the bad ones out of the
> pool, or not.....I seem to remember that Predictor had to do something like
> that with bad CHARMM units.
>
> Drweaser

5) Questions and Answers : Macintosh : Cannot get a model to run (Message 3144)
Posted 6 Sep 2004 by old_user6988
Post:
I have about 6 machines out of the 30 that are now working, judging from the returned results. So there is something in the WUs that ain't right. Hopefully they will find it soon so that the WUs don't constantly get downloaded and rejected and redownloaded. I wonder if they have taken the bad ones out of the pool, or not.....I seem to remember that Predictor had to do something like that with bad CHARMM units.

Drweaser



> > Hi drweaser,
> >
> > Your problem <i>might</i> be related to a Visual Fortran error that's been
> > afflicting the windows build recently. Seems that some workunits have gone out
> > with a duff file.
> >
>
> No it doesn't. That problem was found to be caused by corrupt datafiles
> generated by the server. It is fixed now - and I still have this error on my
> two Macs as well.
> Just attached for the first time on these machines, so it cannot be anything
> old residual either...
>
> Hope the admins will look into this.
>
>
6) Questions and Answers : Macintosh : Cannot get a model to run (Message 3038)
Posted 4 Sep 2004 by old_user6988
Post:
Yep, Definitely still having problems.....

Anyone interested can check my results.....
<a href="http://climateapps2.oucs.ox.ac.uk/cpdnboinc/results.php?userid=6988">drweaser</a>

Didn't find that output file? Maybe it was already uploaded. Tried on ~30 machines, all Mac OS X 10.3.5 with 384M+ RAM and 40G+ HD space, doing nothing else right now since neither Predictor nor SETI is sending out work.
7) Message boards : Cafe CPDN : Signature test thread to test siggy with cpdn.. (Message 3020)
Posted 4 Sep 2004 by old_user6988
Post:
try2


8) Message boards : Cafe CPDN : Signature test thread to test siggy with cpdn.. (Message 3018)
Posted 4 Sep 2004 by old_user6988
Post:
dfgdsdfgsdfg
9) Questions and Answers : Macintosh : Cannot get a model to run (Message 2780)
Posted 3 Sep 2004 by old_user6988
Post:
Just set up for CPDN; SAH is running fine. It downloads some data, unzips it, etc., and then crashes.

This is what I get:

2004-09-02 20:45:00 [---] Insufficient work; requesting more
2004-09-02 20:45:00 [climateprediction.net] Requesting 15494 seconds of work
2004-09-02 20:45:01 [climateprediction.net] Sending request to scheduler: http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2004-09-02 20:45:02 [climateprediction.net] Scheduler RPC to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2004-09-02 20:45:02 [climateprediction.net] Started download of 1pmo_000100690.zip
2004-09-02 20:45:03 [climateprediction.net] Finished download of 1pmo_000100690.zip
2004-09-02 20:45:03 [climateprediction.net] Throughput 22322 bytes/sec
2004-09-02 20:45:03 [climateprediction.net] Starting result 1pmo_000100690_0 using hadsm3 version 4.03
Starting model in /Applications/boinc/projects/climateprediction.net...
Archive: hadsm3data_4.03_powerpc-apple-darwin.zip
creating: 1pmo_000100690/datain/
creating: 1pmo_000100690/datain/ancil/
creating: 1pmo_000100690/datain/ancil/ctldata/
inflating: 1pmo_000100690/datain/ancil/ctldata/spec3a_lw_3_asol2c_hadcm3
inflating: 1pmo_000100690/datain/ancil/ctldata/spec3a_sw_3_asol2b_hadcm3
creating: 1pmo_000100690/datain/ancil/ctldata/stasets/
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01001218
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01002207
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003236
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003237
extracting: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003254
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003255
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003274
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003275
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003276
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003277
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003278
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003279
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003280
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003281
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01003286
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01005207
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01005208
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01005222
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01005223
inflating: 1pmo_000100690/datain/ancil/ctldata/stasets/X01010206
creating: 1pmo_000100690/datain/ancil/ctldata/STASHmaster/
inflating: 1pmo_000100690/datain/ancil/ctldata/STASHmaster/STASHmaster_A
inflating: 1pmo_000100690/datain/ancil/ctldata/STASHmaster/STASHmaster_O
inflating: 1pmo_000100690/datain/ancil/ctldata/STASHmaster/STASHmaster_S
inflating: 1pmo_000100690/datain/ancil/ctldata/STASHmaster/STASHmaster_W
inflating: 1pmo_000100690/datain/ancil/qrclim.icedp.32
inflating: 1pmo_000100690/datain/ancil/qrclim.newsst5.32
inflating: 1pmo_000100690/datain/ancil/qrclim.ozone_preind_corr
inflating: 1pmo_000100690/datain/ancil/qrclim.uvcurr.32
creating: 1pmo_000100690/datain/dumps/
inflating: 1pmo_000100690/datain/dumps/slab32_1810.start
inflating: 1pmo_000100690/datain/lats
inflating: 1pmo_000100690/datain/ppcodes
creating: 1pmo_000100690/dataout/
extracting: 1pmo_000100690/dataout/thist
creating: 1pmo_000100690/jobs/
inflating: 1pmo_000100690/jobs/control.stashc
inflating: 1pmo_000100690/jobs/double.stashc
inflating: 1pmo_000100690/jobs/Recona.12
inflating: 1pmo_000100690/jobs/Recona.13
inflating: 1pmo_000100690/jobs/spec3a_lw_3_asol2c_hadcm3
inflating: 1pmo_000100690/jobs/spec3a_sw_3_asol2b_hadcm3
inflating: 1pmo_000100690/jobs/spin.stashc
inflating: 1pmo_000100690/jobs/yabsd.ihist
inflating: 1pmo_000100690/jobs/yabsd.PRESM_A
extracting: 1pmo_000100690/jobs/yabsd.PRESM_O
extracting: 1pmo_000100690/jobs/yabsd.PRESM_S
extracting: 1pmo_000100690/jobs/yabsd.PRESM_W
inflating: 1pmo_000100690/jobs/yabsd.stashc
inflating: 1pmo_000100690/registration_license.txt
extracting: 1pmo_000100690/stderr_um.txt
extracting: 1pmo_000100690/stdout_um.txt
creating: 1pmo_000100690/tmp/
extracting: 1pmo_000100690/tmp/pipe_dummy
creating: 1pmo_000100690/viz/
inflating: 1pmo_000100690/viz/globe.rgb
Archive: 1pmo_000100690.zip
inflating: 1pmo_000100690/jobs/climate.spin
inflating: 1pmo_000100690/jobs/climate.cont
inflating: 1pmo_000100690/jobs/climate.doub
inflating: 1pmo_000100690/jobs/ncatts.cpdc
inflating: 1pmo_000100690/jobs/spec3a_sw_3_asol2b_hadcm3
Created shared memory region key = 25565
Env Used=DYLD_LIBRARY_PATH=/Applications/boinc/projects/climateprediction.net:../
Copying files for startup...
In pre_initialise_phase (part 1 of 3)
In initialise_phase (part 2 of 3)
In startup_phase (part 3 of 3)
Starting model ID 1pmo_000100690 Phase 1
Waiting for model startup, this may take a minute...
Stack size=48.00 MB
1pmo_000100690 - PH 1 TS 000001 - 00/00/0000 00:00 - H:M:S=0000:00:00 AVG= 0.00 DLT= 0.00
Model crashed...retrying...restart level 0
Preparing for restart...
Rewinding a model-day...
Starting model ID 1pmo_000100690 Phase 1
Stack size=48.00 MB
Waiting for model startup, this may take a minute...
1pmo_000100690 - PH 1 TS 000001 - 00/00/0000 00:00 - H:M:S=0000:00:00 AVG= 0.00 DLT= 0.00
Model crashed...retrying...restart level 1
Preparing for restart...
Rewinding a model-month...
Error: Restart files for dataout/restart.month not found
Giving up, this result exceeded crash count for available restart files.
adding: ncatts.cpdc (deflated 72%)
adding: climate.cont (deflated 79%)
adding: climate.cpdc (deflated 79%)
adding: climate.doub (deflated 78%)
adding: climate.spin (deflated 79%)
adding: 1pmo_000100690.xml (deflated 65%)
adding: ncatts.cpdc (deflated 72%)
adding: ncatts.cpdc (deflated 72%)
adding: ncatts.cpdc (deflated 72%)
adding: stderr_um.txt (stored 0%)
adding: yabsd.out (deflated 93%)
adding: restart.day (deflated 43%)
2004-09-02 20:45:14 [climateprediction.net] Unrecoverable error for result 1pmo_000100690_0 (process exited with code 251 (0xfb))
2004-09-02 20:45:14 [climateprediction.net] Unrecoverable error for result 1pmo_000100690_0 (process exited with code 251 (0xfb))
2004-09-02 20:45:14 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2004-09-02 20:45:14 [climateprediction.net] Deferring communication with project for 1 minutes and 0 seconds
2004-09-02 20:45:14 [climateprediction.net] Computation for result 1pmo_000100690 finished
2004-09-02 20:45:14 [climateprediction.net] Started upload of 1pmo_000100690_0_1.zip
2004-09-02 20:45:14 [climateprediction.net] Started upload of 1pmo_000100690_0_2.zip
2004-09-02 20:45:15 [climateprediction.net] Finished upload of 1pmo_000100690_0_1.zip
2004-09-02 20:45:15 [climateprediction.net] Throughput 3766 bytes/sec
2004-09-02 20:45:15 [climateprediction.net] Started upload of 1pmo_000100690_0_3.zip
2004-09-02 20:45:16 [climateprediction.net] Finished upload of 1pmo_000100690_0_2.zip
2004-09-02 20:45:16 [climateprediction.net] Throughput 102980 bytes/sec
2004-09-02 20:45:16 [climateprediction.net] Started upload of 1pmo_000100690_0_4.zip
2004-09-02 20:45:16 [climateprediction.net] Finished upload of 1pmo_000100690_0_3.zip
2004-09-02 20:45:16 [climateprediction.net] Throughput 3037 bytes/sec
2004-09-02 20:45:16 [climateprediction.net] Started upload of 1pmo_000100690_0_5.zip
2004-09-02 20:45:17 [climateprediction.net] Finished upload of 1pmo_000100690_0_4.zip
2004-09-02 20:45:17 [climateprediction.net] Throughput 3731 bytes/sec
2004-09-02 20:45:18 [climateprediction.net] Finished upload of 1pmo_000100690_0_5.zip
2004-09-02 20:45:18 [climateprediction.net] Throughput 117367 bytes/sec
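The part of that output that matters is the restart sequence: the model crashes at timestep 1, rewinds a model-day, crashes again, tries to rewind a model-month, and gives up because the monthly restart file is not there. A minimal sketch of boiling a long log down to just that, assuming the message wording is exactly as in the output above (the patterns and the name `summarize_restarts` are hypothetical, not part of BOINC or the model):

```python
import re

# Patterns copied from the wording of the model output quoted above (an assumption;
# other model versions may phrase these messages differently).
CRASH_RE = re.compile(r"Model crashed\.\.\.retrying\.\.\.restart level (\d+)")
MISSING_RE = re.compile(r"Error: Restart files for (\S+) not found")

def summarize_restarts(output_text):
    """Collect the restart levels tried, any missing restart file, and whether the run gave up."""
    levels = []
    missing = None
    gave_up = False
    for line in output_text.splitlines():
        crash = CRASH_RE.search(line)
        if crash:
            levels.append(int(crash.group(1)))
            continue
        not_found = MISSING_RE.search(line)
        if not_found:
            missing = not_found.group(1)
        if line.strip().startswith("Giving up"):
            gave_up = True
    return {
        "restart_levels": levels,
        "missing_restart_file": missing,
        "gave_up": gave_up,
    }
```

Run against the output above, this should report two restart attempts before the run was abandoned, which is quicker to compare across 30 machines than reading each full log.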




