"client error" every time

Questions and Answers : Unix/Linux : "client error" every time

old_user126429

Joined: 2 Dec 05
Posts: 2
Credit: 0
RAC: 0
Message 18955 - Posted: 3 Jan 2006, 18:30:07 UTC

I'm trying to run the BOINC version of CPDN, and my results page now shows three results, all with "client error" and less than a second of CPU time. I've attached the stderr from one of the results pages, which features error -161, but I don't know what that means.

Here are a few factoids about this machine that might be relevant. It's a relatively slow machine, a 380 MHz K6-3D. The client I seem to have is sulphur 4.22. I also have the Einstein and LHC projects attached; they seem to be doing fine. My libc version is 2.3.2.

I'm sure that's not enough to go on, but perhaps if someone could clue me in on what error -161 is, I could find out more.

stderr:

<core_client_version>5.2.13</core_client_version>
<stderr_txt>

</stderr_txt>
<message><file_xfer_error>
<file_name>sulphur_huci_000832482_0_1.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_huci_000832482_0_2.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_huci_000832482_0_3.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_huci_000832482_0_4.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_huci_000832482_0_5.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>

</message>
old_user126429

Joined: 2 Dec 05
Posts: 2
Credit: 0
RAC: 0
Message 18956 - Posted: 3 Jan 2006, 18:33:01 UTC

OK, the BB ate the XML from stderr. Don't know how to fix that, but you get the idea.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 18962 - Posted: 3 Jan 2006, 19:52:48 UTC
Last modified: 3 Jan 2006, 19:53:21 UTC

To deal with the display problem, you need to use e.g. WinWord to do a replace on the less than/greater than symbols. Use square brackets instead.

The 161 error is a red herring. It just means something like "the error files BOINC is trying to upload don't exist, or are empty".
The REAL error message is missing. This seems to be a recent "problem", and I don't know why it is happening, just that it is. A LOT!
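
To confirm that every entry in a pasted stderr block really carries the same -161 code, a small Python sketch along these lines could list them. It is only an illustration and not part of BOINC or CPDN: `transfer_errors` and the `<root>` wrapper are names invented here, and it assumes the pasted block is well-formed XML apart from having several top-level elements.

```python
import xml.etree.ElementTree as ET


def transfer_errors(stderr_blob):
    """Return (file_name, error_code) pairs found in a pasted stderr block."""
    # The block has several top-level elements, so wrap it in a synthetic
    # root before parsing (assumption: the rest of the block is valid XML).
    root = ET.fromstring("<root>" + stderr_blob + "</root>")
    pairs = []
    for err in root.iter("file_xfer_error"):
        name = (err.findtext("file_name") or "").strip()
        code = int((err.findtext("error_code") or "0").strip())
        pairs.append((name, code))
    return pairs


# Shortened copy of the stderr posted at the top of this thread.
SAMPLE = """
<core_client_version>5.2.13</core_client_version>
<stderr_txt>

</stderr_txt>
<message><file_xfer_error>
<file_name>sulphur_huci_000832482_0_1.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_huci_000832482_0_2.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>

</message>
"""

errors = transfer_errors(SAMPLE)
# True when every failed upload reports -161, i.e. nothing else is hiding here.
only_161 = all(code == -161 for _, code in errors)
```

If `only_161` is true, the stderr block holds nothing beyond the upload failures, and the model's own log is the next place to look.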

If you look in the file yabsd.out, which is in the dataout folder of your model, there should be an error message, or description, at the bottom of the file. THIS will tell you, (or us), what REALLY happened.
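
A minimal Python sketch of that check, assuming the model keeps its log as `yabsd.out` or as a gzipped `yabsd.out.gz` somewhere under the BOINC project directory. The function names and the example path in the comment are hypothetical, so adjust them to your own installation.

```python
import gzip
from collections import deque
from pathlib import Path


def tail_lines(path, count=10):
    """Return the last `count` lines of a plain or gzipped text file."""
    path = Path(path)
    # Pick the opener from the file extension; both accept text mode "rt".
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", errors="replace") as handle:
        # A bounded deque keeps only the final `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


def tail_all(project_dir, count=10):
    """Map every yabsd.out* file under project_dir to its last lines."""
    results = {}
    for found in sorted(Path(project_dir).rglob("yabsd.out*")):
        if found.is_file():
            results[str(found)] = tail_lines(found, count)
    return results


# Example call (hypothetical path, replace with your own BOINC directory):
# for name, lines in tail_all("/home/me/BOINC/projects").items():
#     print(name, *lines, sep="\n")
```

The last few lines will not always contain an obvious error message, but they show how far the model got before it stopped.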

The usual causes of failures are overheating (caused by lack of airflow in the case and/or dust on the heatsink), overclocking (the processor just can't handle the intense, continuous calculations at that speed), an aggressive AV program that locks files the model is trying to write so it can check them (Avast, Antivir), and, I feel, a bare-minimum power supply that lets the voltages sag under load.

astroWX
Volunteer moderator

Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 18965 - Posted: 3 Jan 2006, 20:14:42 UTC

Doubtful that a 380 MHz K6 will run any CPDN (Classic/Slab/Sulphur Cycle) Work Unit. The recommended minimum for Sulphur Cycle (all that is available under BOINC at the moment) is 1 GHz with at least 256 MB of memory, though 512 MB is better. Even on a minimum machine, the Work Unit would run full-time for months.

Thanks for giving us a try, Dave, and hope to see you back sometime.
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
michelangelo

Joined: 4 Sep 04
Posts: 1
Credit: 5,893,923
RAC: 0
Message 19077 - Posted: 6 Jan 2006, 10:54:16 UTC - in response to Message 18962.  



If you look in the file yabsd.out, which is in the dataout folder of your model, there should be an error message, or description, at the bottom of the file. THIS will tell you, (or us), what REALLY happened.



I am experiencing the same kind of problem and don't see anything that looks like an error message at the bottom of the yabsd.out file. In one example, the last lines are:
O_LEN_D1 = 0,
W_LEN2_LOOKUP = 0,
W_LEN_DATA = 1,
W_LEN_D1 = 0,
(1372 lines in those files)
I got this problem with sulphur 4.22 and now 4.23. Where should I look to diagnose the problem?
old_user139833

Joined: 17 Dec 05
Posts: 1
Credit: 15,588
RAC: 0
Message 21432 - Posted: 20 Mar 2006, 0:27:46 UTC - in response to Message 18962.  

To deal with the display problem, you need to use e.g. WinWord to do a replace on the less than/greater than symbols. Use square brackets instead.

The 161 error is a red herring. It just means something like "the error files BOINC is trying to upload don't exist, or are empty".
The REAL error message is missing. This seems to be a recent "problem", and I don't know why it is happening, just that it is. A LOT!

If you look in the file yabsd.out, which is in the dataout folder of your model, there should be an error message, or description, at the bottom of the file. THIS will tell you, (or us), what REALLY happened.

The usual causes of failures are overheating (caused by lack of airflow in the case and/or dust on the heatsink), overclocking (the processor just can't handle the intense, continuous calculations at that speed), an aggressive AV program that locks files the model is trying to write so it can check them (Avast, Antivir), and, I feel, a bare-minimum power supply that lets the voltages sag under load.




I also have client error status on the two returned units. Dell Inspiron 8600, FC4 with BOINC 5.2.13. The current temperature is 40 degrees, it's not overclocked, and Linux doesn't have viruses :). The yabsd.out files don't seem to help, i.e.:

Script started on Mon 20 Mar 2006 11:23:47 EST
$ for i in `slocate yabsd.out.gz`; do echo $i;zcat $i|tail -10;echo ---------------------------; done
..../BOINC/projects/www.climateprediction.net/sulphur_j3si_200891378/yabsd.out.gz
REPLANCA - time interpolation for field 74
time,time1,time2 3300.000 2880.000 3600.000
hours,int,period 3300 720 8640
Information used in checking ancillary data set:
position of lookup table in dataset: 324
Position of first lookup table referring to data type 20
Interval between lookup tables referring to data type 76
Number of steps 4
STASH code in dataset 123 STASH code requested 123
'Start' position of lookup tables for dataset in overall lookup array
---------------------------
..../BOINC/projects/www.climateprediction.net/sulphur_gohl_000778233/yabsd.out.gz
179 1 1 16 222 3332466 7008 1 1
180 1 1 0 210 3339474 7008 1 3
181 1 1 40 23 3346482 7008 1 1
182 1 1 40 24 3353490 7008 1 1
183 1 1 40 31 3360498 7008 1 3
184 1 1 40 178 3367506 7008 1 1
185 1 1 40 203 3374514 7008 1 3
186 1 1 40 220 3381522 7008 1 3
187 1 1 40 221 3388530 7008 1 3
188 3 1 0 401 3395538 140160 20 1
---------------------------
..../BOINC/projects/www.climateprediction.net/sulphur_hu3t_100832169/yabsd.out.gz
J_PE_JFINP2 = -1,
O_NPROC = 1,
IMOUT = 4*0,
JMOUT = 4*0,
J_PE_IND_MED = 4*0,
NMEDLEV = 0
/
SLAB TIMESTEP 2361
3395537 words long
MODEL DUMP SUCCESSFULLY WRITTEN - 3434914 WORDS TO UNIT 22
---------------------------
$ exit

Script done on Mon 20 Mar 2006 11:24:08 EST

What next?

geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2167
Credit: 64,476,323
RAC: 3,882
Message 21468 - Posted: 20 Mar 2006, 23:29:36 UTC - in response to Message 21432.  

I also have client error status on the two returned units. Dell Inspiron 8600, FC4 with BOINC 5.2.13. The current temperature is 40 degrees, it's not overclocked, and Linux doesn't have viruses :). The yabsd.out files don't seem to help, i.e.:

...

What next?

All sulphur 4.23 models will crash on Linux about halfway through phase 1, per the sticky at the top of this forum. Hopefully your PC will download a new coupled model (hadcm3l), which has proven to be more stable on Linux.

