Posts by old_user49040

1) Questions and Answers : Macintosh : Version 4.22 release (Message 17799)
Posted 6 Dec 2005 by old_user49040
Post:
ID of the failed result: 1348318. It has the same error in Mac_Lib.c as 4.21.
2) Questions and Answers : Macintosh : Version 4.22 release (Message 17769)
Posted 5 Dec 2005 by old_user49040
Post:
Do we need to upgrade the core client as well?

4.22 still crashes with the same error using BOINC client 5.2.8 :-(
3) Questions and Answers : Macintosh : Can't get any decent results, detached from project. (Message 17519)
Posted 29 Nov 2005 by old_user49040
Post:
Have you installed the additional Mac application library, Koen? If not, it's on the download page and this is a direct link.


I tried after reading your message; I installed libcpdn and the latest Mac client. Install went fine, registration okay, new work was downloaded and after that an immediate crash:

<core_client_version>5.2.8</core_client_version>
<message>process exited with code 255 (0xff)
</message>
<stderr_txt>
MacOS Error -43 occured in Mac_Lib.c line 64
MacOS Error -43 occured in Mac_Lib.c line 64

</stderr_txt>

So no go, still won't run :-((

(computer id now is 273573)
4) Questions and Answers : Macintosh : Can't get any decent results, detached from project. (Message 17500)
Posted 28 Nov 2005 by old_user49040
Post:
I've detached my dual G5 from climateprediction; I cannot get any decent results at all. Every day my system downloads new work, gets up to one or two cycles and then boom. So far I've racked up 48 experiments, and only 14 of them ran the full course.

None of the sulphur or hadsm 4.13 experiments I've been sent have produced anything. The 14 experiments that did complete were the hadsm 4.12 ones.

All sulphur experiments exit with a status of 255 after one or two cycles and persistently crash in thread 1. Excerpt from crashreporter:

Thread 1 Crashed:
0 libGL.dylib 0x92dff3e0 glDeleteTextures + 48
1 ...r_4.21_powerpc-apple-darwin 0x0001e23c graphics_thread_cleanup + 432 (crt.c:300)
2 ...r_4.21_powerpc-apple-darwin 0x00005a64 app_cleanup() + 36 (crt.c:300)
3 ...r_4.21_powerpc-apple-darwin 0x00006f58 checkBOINCStatus(bool) + 228 (crt.c:300)
4 ...r_4.21_powerpc-apple-darwin 0x000054dc mainLoop() + 76 (crt.c:300)
5 ...r_4.21_powerpc-apple-darwin 0x00004fbc worker() + 1844 (crt.c:300)
6 ...r_4.21_powerpc-apple-darwin 0x0004c810 foobar(void*) + 60 (graphics_impl.C:75)
7 libSystem.B.dylib 0x9002b200 _pthread_body + 96

Weird thing is that I *never* run the visual stuff; boinc always runs in the background, so why this thread is even running beats me.

hadsm 4.13 also persistently crashes in the same thread and same function.

hadsm 4.12 did run fine as long as I started boinc via a shell script that first ran ulimit -n 1024, raising the maximum number of open files to 1024 (with the default of 255 it exits with a 'no more file handles' error).
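
For anyone who would rather not wrap the client in a shell script, here is a minimal Python sketch of the same idea: raise the soft open-file limit, then start the client as a child process so it inherits the limit. The function names and the ./boinc path in the usage comment are my own placeholders, not part of BOINC; adjust them to your install.

```python
import resource
import subprocess


def raise_nofile_limit(target=1024):
    """Raise the soft open-file limit to `target` if it is currently lower.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY

    # Already high enough (or unlimited): leave it alone.
    if soft == unlimited or soft >= target:
        return soft

    # An unprivileged process cannot go above the hard limit.
    new_soft = target if hard == unlimited else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


def run_with_limit(command, target=1024):
    """Raise the limit, then run `command`; the child inherits the limit."""
    raise_nofile_limit(target)
    return subprocess.call(command)


# Hypothetical usage (path to the client is an assumption):
# run_with_limit(["./boinc"])
```

This only does what ulimit -n 1024 does in the shell; it does not fix whatever makes the model need that many file handles in the first place.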

For those interested, computer in question is 202499.

I'll keep monitoring the Mac forum and re-attach when the CPDN team reports that they've found and fixed the problem(s).


To be complete:

MacOS 10.4.3 (but problems arose with 10.4.2 as well)
Dual PowerMac G5, 1.8 GHz
3.5 GB RAM (the boinc client itself doesn't detect this correctly, it only reports 2 GB).
5) Questions and Answers : Unix/Linux : hadsm 4.12 crashed with message: ATM_DYN: NEGATIVE THETA DETECTED (Message 11611)
Posted 4 Apr 2005 by old_user49040
Post:

Model 1n06_000097254 crashed after 3 restarts at timestep 130268 in phase 1.

yabsd.out shows:

Model aborted with error code - 1 Routine and message:-
ATM_DYN : NEGATIVE THETA DETECTED.

Is this an issue with 4.12 or simply a bad model?

Seems as if the Linux crashes are still not resolved :-(
6) Questions and Answers : Unix/Linux : model 4.12 still crashing in phase 1 :-((( (Message 11592)
Posted 3 Apr 2005 by old_user49040
Post:
Last week 4.12 was released, which was supposed to address some of the issues that made the models crash. After resetting all projects on my Linux systems (I've got three), they all started with 4.12, downloaded new models, etc.

But now they are still crashing, and again in phase 1 around the same timestep (around 118000). It's getting really frustrating to see them crashing all the time!

Can't you guys just roll back to 4.10?? March has been completely lost for all participants running Linux hosts: first 4.11 crashing, then the site down during the Easter weekend, and now 4.12 also seems to be crashing.

boinc (4.19) output itself shows:

1n06_000097254 - PH 1 TS 117922 - 27/09/1817 17:00 - H:M:S=0080:03:37 AVG= 2.44 DLT= 0.00
1n06_000097254 - PH 1 TS 117923 - 27/09/1817 17:30 - H:M:S=0080:03:38 AVG= 2.44 DLT= 1.00
Model crashed...retrying...restart level 0
Preparing for restart...
Rewinding a model-day...
Starting model ID 1n06_000097254 Phase 1
Stack size=48.00 MB
Waiting for model startup, this may take a minute...
1n06_000097254 - PH 1 TS 117793 - 25/09/1817 00:30 - H:M:S=0080:03:39 AVG= 2.45 DLT= 0.00
1n06_000097254 - PH 1 TS 117794 - 25/09/1817 01:00 - H:M:S=0080:03:49 AVG= 2.45 DLT= 9.98

and some time later, followed by:

1n06_000097254 - PH 1 TS 117922 - 27/09/1817 17:00 - H:M:S=0080:09:06 AVG= 2.45 DLT= 1.00
Model crashed...retrying...restart level 1
Preparing for restart...
Rewinding a model-month...
Copying restart files for model retry...
Starting model ID 1n06_000097254 Phase 1
Stack size=48.00 MB
Waiting for model startup, this may take a minute...
1n06_000097254 - PH 1 TS 116641 - 01/09/1817 00:30 - H:M:S=0080:09:07 AVG= 2.47 DLT= 0.00

and the latest crash:

1n06_000097254 - PH 1 TS 133626 - 24/08/1818 21:00 - H:M:S=0091:46:10 AVG= 2.47 DLT= 1.00
Model crashed...retrying...restart level 2
Preparing for restart...
Rewinding a model-year...
Copying restart files for model retry...
Starting model ID 1n06_000097254 Phase 1
Stack size=48.00 MB
Waiting for model startup, this may take a minute...
1n06_000097254 - PH 1 TS 120961 - 01/12/1817 00:30 - H:M:S=0091:46:11 AVG= 2.73 DLT= 0.00

Next crash will cause the model to abort and download a new one.
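
To keep an eye on this across my three hosts, something like the following rough Python sketch could pull the crash points out of the boinc output. It assumes the log lines look exactly like the ones pasted above (progress lines with "PH n TS nnnnnn" and the "Model crashed...retrying...restart level n" marker); the function name and regular expressions are my own guesses from this output, not anything documented by CPDN.

```python
import re

# Progress line, e.g. "1n06_000097254 - PH 1 TS 117923 - 27/09/1817 17:30 - ..."
PROGRESS = re.compile(r"^\S+ - PH\s+(\d+) TS\s+(\d+) - ")
# Crash marker, e.g. "Model crashed...retrying...restart level 0"
CRASH = re.compile(r"^Model crashed\.\.\.retrying\.\.\.restart level (\d+)")


def find_crashes(lines):
    """Return a list of (last_timestep_seen, restart_level) for each crash."""
    crashes = []
    last_ts = None  # stays None if a crash appears before any progress line
    for line in lines:
        line = line.strip()
        progress = PROGRESS.match(line)
        if progress:
            last_ts = int(progress.group(2))
            continue
        crash = CRASH.match(line)
        if crash:
            crashes.append((last_ts, int(crash.group(1))))
    return crashes


sample = [
    "1n06_000097254 - PH 1 TS 117922 - 27/09/1817 17:00 - H:M:S=0080:03:37 AVG= 2.44 DLT= 0.00",
    "1n06_000097254 - PH 1 TS 117923 - 27/09/1817 17:30 - H:M:S=0080:03:38 AVG= 2.44 DLT= 1.00",
    "Model crashed...retrying...restart level 0",
]
print(find_crashes(sample))  # [(117923, 0)]
```

In the output above, restart levels 0, 1 and 2 correspond to rewinding a model-day, a model-month and a model-year, so a crash reported at level 2 is the last retry before the model gives up.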

There is not much other information in the log files. The yabs.out file doesn't contain the 'negative pressure' message and all stderr log files from the model itself are empty.

Any pointers to get this stable??



