Cross-project ID's question

Digby

Joined: 17 Feb 06
Posts: 89
Credit: 4,309,159
RAC: 0
Message 52299 - Posted: 22 Jul 2015, 13:27:06 UTC
Last modified: 22 Jul 2015, 13:38:58 UTC

This may be a naive question...

Over the past week I have been suspending my cpdn project at night and then restarting it in the morning.

This morning it restarted and six (of eight) tasks prematurely crashed...

I restored the eight tasks from a backup and rebooted. Boinc restarted OK, but the Boinc Manager Event Log showed that a new cross-project ID had been generated.

Does anyone know if that new cross-project ID is temporary until the restored tasks catch up where the failed tasks last reported to the server? I am one user on the same machine so presumably I should have just one cross-project ID???

Thanks

Suspending network activity - user request
climateprediction.net | project resumed by user
Resuming network activity
climateprediction.net | update requested by user
climateprediction.net | Sending scheduler request: Requested by user.
climateprediction.net | Not requesting tasks: don't need (CPU: job cache full; NVIDIA GPU: no applications)
climateprediction.net | Scheduler request completed
climateprediction.net | Generated new computer cross-project ID: 987c000809c9212e8f54e88cb97e3041
Richard Haselgrove

Joined: 1 Jan 07
Posts: 942
Credit: 34,148,205
RAC: 4,219
Message 52302 - Posted: 22 Jul 2015, 16:31:37 UTC - in response to Message 52299.  

Note that the wording is "Generated new computer cross-project ID". This is not the same thing as either the Computer ID shown for your computers on this web site, or the User CPID used by the cross-project statistics sites. So far as I can tell, the computer CPID is of no practical use and changes can be safely ignored.

For peace of mind when restoring from backups, increase the <rpc_seqno> for the project(s) you are restoring in client_state.xml, to a figure greater than the "Number of times client has contacted server" shown in the computer details for the same machine - edit the file before you restart BOINC.
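
For anyone who would rather script that edit than do it by hand, here is a minimal Python sketch of the <rpc_seqno> bump described above. It assumes that client_state.xml keeps each project in a <project>...</project> block containing <master_url> and <rpc_seqno> tags; the function name bump_rpc_seqno and the margin parameter are made up for illustration. It works on the file's text with regular expressions, not an XML library, so everything else in the file is left exactly as it was. Stop BOINC and keep a copy of the file before trying anything like this.

```python
import re

# Hypothetical helper, not part of BOINC: raise <rpc_seqno> for one project.
# Assumes each project sits in its own <project>...</project> block.
PROJECT_RE = re.compile(r"<project>.*?</project>", re.DOTALL)
SEQNO_RE = re.compile(r"(<rpc_seqno>)\s*(\d+)\s*(</rpc_seqno>)")


def bump_rpc_seqno(state_text, master_url, server_contacts, margin=10):
    """Return state_text with the matching project's <rpc_seqno> set
    above server_contacts (the "Number of times client has contacted
    server" figure from the web site). Other projects are untouched."""
    target = server_contacts + margin

    def fix_seqno(match):
        current = int(match.group(2))
        # Never lower the value; only raise it if it is too small.
        return f"{match.group(1)}{max(current, target)}{match.group(3)}"

    def fix_project(match):
        block = match.group(0)
        if master_url not in block:
            return block  # a different project, leave it alone
        return SEQNO_RE.sub(fix_seqno, block, count=1)

    return PROJECT_RE.sub(fix_project, state_text)
```

To use it, read client_state.xml into a string, pass it through bump_rpc_seqno with part of the project URL and the contact count from the computer details page, and write the result back before restarting BOINC.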

On the other hand, I notice that you do have a new Computer ID: 1370014 shown on your account today. It has no tasks, but the strikingly-similar computer ID: 1362952 has 8 tasks in progress.

If these are the same computer, you should check that your computer's client_state.xml has <hostid> 1362952. If not, re-restore the tasks, and correct both the <hostid> and the <rpc_seqno> as above before resuming computation.
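
A read-only check of those two values could look roughly like the sketch below. It is only an illustration: it assumes the <project> block in client_state.xml holds <master_url>, <hostid> and <rpc_seqno> tags, and read_project_ids is a made-up name. It changes nothing on disk.

```python
import re


def read_project_ids(state_text, master_url):
    """Return {'hostid': int or None, 'rpc_seqno': int or None} for the
    first <project> block mentioning master_url, or None if no block
    matches. Assumed layout; this is not an official BOINC tool."""
    for block in re.findall(r"<project>.*?</project>", state_text, re.DOTALL):
        if master_url not in block:
            continue
        ids = {}
        for tag in ("hostid", "rpc_seqno"):
            found = re.search(rf"<{tag}>\s*(\d+)\s*</{tag}>", block)
            ids[tag] = int(found.group(1)) if found else None
        return ids
    return None
```

Comparing the returned hostid with the Computer ID on the web site (1362952 in this case), and the rpc_seqno with the contact count shown there, tells you whether the file needs editing before BOINC is restarted.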
Digby

Joined: 17 Feb 06
Posts: 89
Credit: 4,309,159
RAC: 0
Message 52305 - Posted: 22 Jul 2015, 22:13:32 UTC

Thanks for the reply Richard. I looked at the computers on my account and realised that this morning, before I restored the backup, I had updated the graphics drivers from 340.76 to 346.59. The restored client_state.xml was expecting 340.76 and when it saw 346.59 it must have thought this is a new machine and set up a new hostid automatically.

Now that I have a reason, I am not too bothered if Boinc thinks it's on a new machine, given that I'll probably be upgrading to new Ubuntu versions in future.

Before I read your post my machine had already contacted the server twice...and it's done a day's crunching, so I think it's probably safe to leave things as they are for the moment...but I'll keep an eye on things.

Thanks for your help.
Richard Haselgrove

Joined: 1 Jan 07
Posts: 942
Credit: 34,148,205
RAC: 4,219
Message 52308 - Posted: 22 Jul 2015, 22:55:09 UTC - in response to Message 52305.  
Last modified: 22 Jul 2015, 22:57:48 UTC

No, please read what I said. It's not your changed graphics driver that matters, it's the <rpc_seqno>, or "Remote Procedure Call - sequence number". When you restored from backup, you must have re-imported an old sequence number. The BOINC server software sees that as an attempt at cheating - trying to boost Recent Average Credit by using two computers in parallel - and responds by assigning a new Host ID.

Looking at your computers on the website, I see

Computer ID    Last contact
-----------    ------------------------
1370014        22 Jul 2015 15:30:14 UTC
1362952        21 Jul 2015 19:56:20 UTC

- implying that the computer now running is using the wrong ID number. That will likely invalidate your running tasks if allowed to continue. Now that you've got this far, it might be easiest to merge the two host records so the tasks are assigned to the correct host before they are completed.
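
To make the sequence-number check described in this post concrete, here is a toy Python model of it. This is not the BOINC server code: the class and method names (ToyScheduler, handle_request) are invented, and the only rule it encodes is the one stated above, that a request carrying an older sequence number than the server has on record gets a new Host ID.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class HostRecord:
    host_id: int
    rpc_seqno: int = 0  # highest sequence number seen from this host


class ToyScheduler:
    """Illustrative model only; names and structure are assumptions."""

    def __init__(self, first_host_id=1):
        self._next_id = count(first_host_id)
        self.hosts = {}

    def new_host(self, rpc_seqno=0):
        host = HostRecord(next(self._next_id), rpc_seqno)
        self.hosts[host.host_id] = host
        return host

    def handle_request(self, host_id, rpc_seqno):
        """Return the Host ID the client should use from now on."""
        host = self.hosts.get(host_id)
        if host is None or rpc_seqno < host.rpc_seqno:
            # Stale sequence number: looks like a second copy of the
            # same client, so a fresh host record is created.
            return self.new_host(rpc_seqno).host_id
        host.rpc_seqno = rpc_seqno
        return host.host_id
```

In this model a client restored from backup with an old rpc_seqno is handed a new ID, while one whose rpc_seqno was raised past the recorded value keeps its original ID, which is why editing the file before restarting avoids the duplicate host record.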
Digby

Joined: 17 Feb 06
Posts: 89
Credit: 4,309,159
RAC: 0
Message 52310 - Posted: 23 Jul 2015, 11:10:50 UTC
Last modified: 23 Jul 2015, 11:11:08 UTC

Yes, that makes sense. So it's the <rpc_seqno> that raises a flag with the server... that confirms the point you made:

"For peace of mind when restoring from backups, increase the <rpc_seqno> for the project(s) you are restoring in client_state.xml, to a figure greater than the "Number of times client has contacted server" shown in the computer details for the same machine - edit the file before you restart BOINC."

I'll try and remember that for the future.

I decided to 'merge computers by name' from my web account panel. This took about 10 seconds and came back confirming what it had done. It listed some names from years gone by that I had forgotten about...:)

Hopefully this has resolved the issue.

Thanks for the advice.