climateprediction.net home page
'Client detached' redesignation of models

Helmer Bryd

Joined: 16 Aug 04
Posts: 147
Credit: 7,748,561
RAC: 8,366
Message 37612 - Posted: 2 Aug 2009, 9:36:33 UTC

My hadcm3 control run shows trickles alright but the graph disappeared and server state says "over" and "client detached", although it's still running fine in year 2062.
Hope it comes back, was fun to see this cold model: 8390738

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 37614 - Posted: 2 Aug 2009, 11:16:38 UTC
Last modified: 2 Aug 2009, 11:34:43 UTC

Hi cwhyl

I hope you don't mind that I have moved your post and this one of mine to this new thread. This problem deserves its own space for a proper discussion.

I know of two conditions that can cause a task to be reclassified as 'Client detached':

- after the user has restored a backup
- after the user has merged computer records

Did you do either of these?

There was recently a message about this on the boinc_dev email list from Josef W Segur who, AFAIK, is one of the people who write optimised apps for SETI. He said that some SETI users' tasks were apparently being reclassified as 'Client detached' spontaneously. If I understood the email discussion correctly, the reclassification occurs when BOINC's host lookup by host ID fails (I imagine it fails only momentarily).

I pointed out that this redesignation of tasks is a nuisance because, from the moment of redesignation onwards, some of the information on the tasks' web pages fails to update or becomes unavailable, e.g. the graphs, as you've already noticed. When the model finishes, it won't get any stderr output on its web page either. It will still receive correct credits though, so it's clear that the server has regained proper contact with the computer and task. Another problem is that some users who see 'Client detached' and 'Over' may assume that the model has failed in some way and abort it. Redesignated models are in fact perfectly OK and crunch on robustly.

David Anderson said he'd checked in a fix for this problem but didn't say what the fix consisted of. I assume this fix will take effect in the next alpha or public release version of BOINC.
Cpdn news

Ingleside
Joined: 5 Aug 04
Posts: 108
Credit: 18,267,398
RAC: 35,346
Message 37643 - Posted: 5 Aug 2009, 19:18:57 UTC - in response to Message 37614.  

David Anderson said he'd checked in a fix for this problem but didn't say what the fix consisted of. I assume this fix will take effect in the next alpha or public release version of BOINC.

The fix is not in the client but in the server code, so it will take effect the next time CPDN upgrades its scheduling server.


mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 37644 - Posted: 5 Aug 2009, 20:31:03 UTC

Thanks for the info. The sooner the better as far as I'm concerned, but I believe that installing a new server version takes two programmers many hours (one customising the code and the other documenting every setting) so they won't be in a hurry.
Cpdn news

old_user85070
Joined: 27 Jun 05
Posts: 6
Credit: 1,733,861
RAC: 0
Message 37858 - Posted: 21 Aug 2009, 4:10:12 UTC - in response to Message 37614.  

I now have 4 models marked 'client detached' after 2 restores from backups. I have an iceworld slab, so I tried to restore to the earlier version 19 but ended up with 'client detached'. When I saw it was 'over' I started a new model; then, when I couldn't get a new model, I restored the old 3 models. I am close to finishing the 2 slabs, 1 of which is the iceworld. I will attempt to finish all.
Dibb

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 37859 - Posted: 21 Aug 2009, 9:38:54 UTC

Hi Dibb

Yes, models that are redesignated 'client detached' are perfectly all right apart from this and should be completed.

Except that you have an iceworld (though the cause of that is not really understood yet - it has nothing to do with the detached label).

Iceworlds should not be continued as the data they produce from that point onwards will be incomplete. You need to abort it.

But could you please not delete your last backup containing the iceworld before it became an iceworld. One of the moderators, Iain Inglis, may well be interested as he's been producing some very sophisticated graphics showing where, geographically, iceworlds start.

Could you please unhide your computer (setting in your account) so we can take a look at your iceworld trickles. After two or three days you could hide it again if you wish.
Cpdn news

Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 37892 - Posted: 24 Aug 2009, 9:46:53 UTC - in response to Message 37891.  
Last modified: 24 Aug 2009, 9:55:23 UTC

My detached WU still show as new even though I have aborted the Iceworld and finished the other Slab. Do you have any follow up as to when it will show as completed?
Dibb

Annoyingly, the work units shown as 'client detached' will always show that. However, the project server happily accepts the uploaded data and trickles - and full credits are awarded.

There is a way to stop the 'client detached' message appearing when a backup is restored. The part of BOINC that runs on your PC maintains a count of the number of times it has contacted the project's part of BOINC. The 'client detached' message appears because the restored installation attempts to contact the project using a value for this count that is lower than the project server expects. The 'client detached' message is really an 'out of sequence communication' message.
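
As a rough illustration of the check described above (a sketch only, not BOINC's actual server code; the function name and the example numbers are made up for illustration):

```python
def is_out_of_sequence(server_expected: int, client_reported: int) -> bool:
    """Return True when the client reports a contact count lower than
    the one the project server has on record, which is the situation
    that arises after an older backup has been restored."""
    return client_reported < server_expected


# Hypothetical numbers: the server has seen 2697 contacts, but the
# restored backup was taken when the count was still 2500.
print(is_out_of_sequence(server_expected=2697, client_reported=2500))

# A client that was never restored reports a count that is not lower
# than the server's record, so nothing is flagged.
print(is_out_of_sequence(server_expected=2697, client_reported=2697))
```
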

The count is stored in the client_state.xml file in the top BOINC folder (e.g. C:\Documents and Settings\All Users\Application Data\BOINC). The relevant part of the file looks like this:

<rpc_seqno>2697</rpc_seqno>

The fix is then to make a note of the value for climateprediction.net from client_state.xml in your current installation, restore the backup and change the value in the restored backup's version of client_state.xml to the current value. The project server won't then realise a backup has been restored and the 'client detached' message will not appear. (Don't attempt this when BOINC is running because the client_state.xml file is constantly updated.)
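
The manual edit above could also be scripted. The following is a minimal sketch, not a tested tool: it assumes that each project's entry in client_state.xml is a <project>...</project> block that mentions the project URL and contains one <rpc_seqno> line like the one shown above. That layout, and the helper names read_seqno and set_seqno, are assumptions made for illustration. It works on the file's text rather than re-writing the XML, so the rest of the file is left untouched.

```python
import re

# Assumed layout: one <project>...</project> block per attached project.
PROJECT_RE = re.compile(r"<project>.*?</project>", re.DOTALL)
SEQNO_RE = re.compile(r"<rpc_seqno>(\d+)</rpc_seqno>")


def _find_project(xml_text, url_fragment):
    """Return the regex match for the first project block mentioning url_fragment."""
    for match in PROJECT_RE.finditer(xml_text):
        if url_fragment in match.group(0):
            return match
    raise ValueError(f"no <project> block mentions {url_fragment!r}")


def read_seqno(xml_text, url_fragment="climateprediction.net"):
    """Read the contact count recorded for the given project."""
    found = SEQNO_RE.search(_find_project(xml_text, url_fragment).group(0))
    if found is None:
        raise ValueError("project block has no <rpc_seqno> entry")
    return int(found.group(1))


def set_seqno(xml_text, new_value, url_fragment="climateprediction.net"):
    """Return a copy of xml_text with that project's count set to new_value."""
    match = _find_project(xml_text, url_fragment)
    replacement = f"<rpc_seqno>{int(new_value)}</rpc_seqno>"
    block, count = SEQNO_RE.subn(replacement, match.group(0), count=1)
    if count != 1:
        raise ValueError("project block has no <rpc_seqno> entry")
    # Splice the edited block back in, leaving everything else unchanged.
    return xml_text[:match.start()] + block + xml_text[match.end():]
```

To use it, with BOINC stopped, read the text of the current client_state.xml, call read_seqno on it, restore the backup, then write the result of set_seqno(restored_text, that_value) back to the restored file. Keeping a copy of the file before overwriting it is a sensible precaution.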

This is all rather fiddly. However, BOINC doesn't really understand the concept of backups and that's the root of the problem. CPDN has long work units where backups make sense: most projects have much shorter work units that simply aren't worth backing up.

Iain