Unable to communicate with project server...

old_user221094

Joined: 19 Jan 07
Posts: 9
Credit: 2,233,821
RAC: 0
Message 29450 - Posted: 5 Jul 2007, 13:07:50 UTC

I'm getting the following set of messages on two of my machines (all the others are fine). Any ideas what may be happening? For instance, which machine is it actually trying to contact?

05/07/2007 13:36:22|climateprediction.net|Sending scheduler request: Requested by user
05/07/2007 13:36:22|climateprediction.net|(not requesting new work or reporting completed tasks)
05/07/2007 13:36:23||Project communication failed: attempting access to reference site
05/07/2007 13:36:24||Access to reference site succeeded - project servers may be temporarily down.
05/07/2007 13:36:25|climateprediction.net|Scheduler request failed: failed sending data to the peer
05/07/2007 13:36:25|climateprediction.net|Deferring communication for 24 min 21 sec
05/07/2007 13:36:25|climateprediction.net|Reason: scheduler request failed


Any help gratefully appreciated!
--Richard
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 29451 - Posted: 5 Jul 2007, 13:33:46 UTC


Hi Richard

Whichever server it is, trickle or upload (upload if you have a zip file in the Transfers tab), one of them is down at present.
You can see the status by clicking on "Server Stats" in the menu to the left of here.

old_user221094

Joined: 19 Jan 07
Posts: 9
Credit: 2,233,821
RAC: 0
Message 29452 - Posted: 5 Jul 2007, 14:14:54 UTC
Last modified: 5 Jul 2007, 14:18:49 UTC

This would be trickle; there are no files to upload. Hmm, all servers SAY they're up. I wonder if I have something corrupt that's directing me to an invalid server...

--Richard

(edit) Hmm, netstat tells me that I tried to contact a machine called 'targhee.open.ac.uk'. Is this expected?
Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 29453 - Posted: 5 Jul 2007, 15:52:22 UTC - in response to Message 29452.  
Last modified: 5 Jul 2007, 15:55:48 UTC

Hmm, netstat tells me that I tried to contact a machine called 'targhee.open.ac.uk'. Is this expected?

Definitely not. Scheduler requests should be going to climateapps2.oucs.ox.ac.uk

Check the file master_climateprediction.net.xml in your BOINC directory. Line 14 in the file should read as follows:

      <scheduler> http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi </scheduler>


BOINC requests a reload of the file after 10 scheduler request failures, so you could always force a reload by doing a few manual project updates.
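
The check described above can be sketched in Python. This is only an illustration, assuming the master file is plain text containing `<scheduler>` tags like the one quoted; the function name `extract_schedulers` and the sample excerpt are made up for the example and are not part of BOINC.

```python
import re

# Matches <scheduler> URL </scheduler>, tolerating the padding around the URL.
SCHEDULER_TAG = re.compile(r"<scheduler>\s*(.*?)\s*</scheduler>",
                           re.IGNORECASE | re.DOTALL)


def extract_schedulers(master_text):
    """Return the scheduler URLs found in the master file text, in order."""
    return [url for url in SCHEDULER_TAG.findall(master_text) if url]


# Hypothetical excerpt; in practice, read master_climateprediction.net.xml
# from the BOINC directory and pass its contents in.
sample = """
      <scheduler> http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi </scheduler>
"""

print(extract_schedulers(sample))
```

If the list contains anything other than the climateapps2 URL, the master file itself is suspect; if it looks right, the problem is more likely elsewhere.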
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 29454 - Posted: 5 Jul 2007, 16:42:40 UTC
Last modified: 5 Jul 2007, 16:44:01 UTC

An upload server certainly isn't running at the moment:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/server_status.php

The 'targhee.open.ac.uk' server will be the one at the Open Uni in Milton Keynes that hosts the independent forum. You must have looked in there.

Just suspend network activity and wait......
Cpdn news
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 29456 - Posted: 5 Jul 2007, 21:49:05 UTC


The server that was down (uploadatm) is now working again.

Just to be quite clear about it, THIS is the page where you should end up after clicking on the Server Stats link on the left.

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 29457 - Posted: 5 Jul 2007, 23:42:43 UTC

It should really say Server Status. I have a list somewhere of typos that members have pointed out on the cpdn website, but the guys in Oxford seem so busy that I've never dared send it.
Cpdn news
old_user221094

Joined: 19 Jan 07
Posts: 9
Credit: 2,233,821
RAC: 0
Message 29461 - Posted: 6 Jul 2007, 12:00:24 UTC

Definitely a mystery. The scheduler listed in the master file is indeed climateapps2, and if I ping it, the ping works. The server status for it seems to be green, but I still can't connect :(. The machine next to it works just fine.

I'm stumped, I'm afraid :(
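
A successful ping only shows that the host answers ICMP; it does not show that a TCP connection to the web server can be opened. A minimal Python sketch of that second test follows. It assumes port 80 because the scheduler URL uses plain http, and `can_connect` is a name made up for this example.

```python
import socket


def can_connect(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

Running `can_connect("climateapps2.oucs.ox.ac.uk")` on both the working and the failing machine would show whether the failing one is blocked at the connection stage. A True result covers only the handshake, so a failure while sending data could still occur afterwards.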

--Richard
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 29463 - Posted: 6 Jul 2007, 16:59:30 UTC


Do the two machines have the same firewall and firewall settings? (The other thing to check is proxy servers, but most home machines don't use them.)
I'm a volunteer and my views are my own.
News and Announcements and FAQ
old_user422920
Joined: 5 Nov 06
Posts: 24
Credit: 548,923
RAC: 0
Message 29487 - Posted: 9 Jul 2007, 13:12:08 UTC
Last modified: 9 Jul 2007, 13:25:16 UTC

Same problem here :
9-7-2007 15:09:27|climateprediction.net|Sending scheduler request: Requested by user
9-7-2007 15:09:27|climateprediction.net|(not requesting new work or reporting completed tasks)
9-7-2007 15:09:50||Project communication failed: attempting access to reference site
9-7-2007 15:09:51||Access to reference site succeeded - project servers may be temporarily down.
9-7-2007 15:09:53|climateprediction.net|Scheduler request failed: couldn't connect to server
9-7-2007 15:09:53|climateprediction.net|Deferring communication for 1 min 0 sec
9-7-2007 15:09:53|climateprediction.net|Reason: scheduler request failed

Even 20 manual tries and all the automatic attempts to connect produced nothing.
I was already wondering why I was suddenly getting much lower credit all the time.
The only thing that has changed firewall-, settings- or program-wise is that I installed 5.10.13, because the earlier versions were buggy.
old_user221094

Joined: 19 Jan 07
Posts: 9
Credit: 2,233,821
RAC: 0
Message 29490 - Posted: 9 Jul 2007, 14:41:38 UTC

Well, I fixed my problem, somehow :). It was indeed trying to communicate with targhee, not climateapps2. Why, I have no idea, since in the master_climateprediction.net.xml file the scheduler was specified as climateapps2. However, it had something to do with the client_state.xml file. Just to try, I restored a backup from before the problem started, and sure enough the problem went away: when I did an update, it tried to connect to climateapps2 and all was fine. So I restored the latest backup, but overlaid it with the client_state.xml file from the working backup, with a little hand editing to update fields within it. When I fired it up, it was back at its latest point, and when I did an update it connected fine and uploaded the 4 queued trickles it had by that time. An unfortunate side effect is that my server status is now Over Unknown New (but that's no worse than restoring from a backup anyway).

But how did 'targhee' get embedded in that file? And why was BOINC stubbornly trying to use that instead of climateapps2? That's still a mystery, but at least I'm running again now!
--Richard
Strathpeffer
Joined: 9 Jan 07
Posts: 497
Credit: 342,899
RAC: 0
Message 29492 - Posted: 9 Jul 2007, 15:55:52 UTC

Well done, Richard - the "Backup King" does it again! ;-)
Visit the Scotland team
old_user422920
Joined: 5 Nov 06
Posts: 24
Credit: 548,923
RAC: 0
Message 29495 - Posted: 9 Jul 2007, 16:41:17 UTC
Last modified: 9 Jul 2007, 16:49:49 UTC

Well, I checked your solution, but no go. In this file the link is:
<scheduler_url>http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi</scheduler_url>
If I copy this into my browser, I have no problem connecting to that server and page, and it shows me the following:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<scheduler_reply>
<scheduler_version>509</scheduler_version>
<master_url>http://climateprediction.net/</master_url>
<message priority="low">Incomplete request received.</message>
</scheduler_reply>

So I guess something else is buggy on cpdn, because my other machine has now contacted this server as well.
Does anyone have any bright ideas besides firewall or proxy (which aren't the problem anyway)?

I have stopped this project because it has already been running for more than 100 hours, and I see no point in running it until the problem is fixed.
This is the 5th time I have had trouble running cpdn.
Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 29496 - Posted: 9 Jul 2007, 17:06:17 UTC - in response to Message 29490.  

But how did 'targhee' get embedded in that file? And why was BOINC stubbornly trying to use that instead of climateapps2?

When the master file is downloaded BOINC extracts all the <scheduler> tags and writes them to the <project> section of client_state.xml (as <scheduler_url> tags and after clearing out the current tags). So the files should always have the same set of scheduler names. BOINC writes to client_state.xml in many places (e.g. after every checkpoint) and the in-memory copy of the project scheduler set will be written to the file. Which suggests that something caused the real scheduler value to be overwritten.

What's recorded in stdoutdae.txt between the start of the last successful scheduler request and the first failure?
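
Since the two files should carry the same scheduler set, a quick consistency check is possible. The Python sketch below is only an illustration: it assumes you pass in the master file text and the text of the relevant `<project>` section of client_state.xml, and `scheduler_mismatch` and the example URLs are hypothetical.

```python
import re

# <scheduler> tags appear in the master file, <scheduler_url> in client_state.xml.
MASTER_TAG = re.compile(r"<scheduler>\s*(.*?)\s*</scheduler>", re.DOTALL)
STATE_TAG = re.compile(r"<scheduler_url>\s*(.*?)\s*</scheduler_url>", re.DOTALL)


def scheduler_mismatch(master_text, project_section_text):
    """Return (only_in_master, only_in_state) as sets of scheduler URLs."""
    master = set(MASTER_TAG.findall(master_text))
    state = set(STATE_TAG.findall(project_section_text))
    return master - state, state - master


# Hypothetical inputs: the state copy points somewhere the master file does not.
master_text = "<scheduler> http://good.example/cgi </scheduler>"
state_text = "<project><scheduler_url>http://stale.example/cgi</scheduler_url></project>"

only_master, only_state = scheduler_mismatch(master_text, state_text)
print("missing from client_state.xml:", only_master)
print("unexpected in client_state.xml:", only_state)
```

Two empty sets mean the files agree; anything in the second set is a scheduler that BOINC holds in its state but that the master file does not list.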
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
old_user422920
Joined: 5 Nov 06
Posts: 24
Credit: 548,923
RAC: 0
Message 29498 - Posted: 9 Jul 2007, 18:18:04 UTC

I'm amazed. I have no clue what happened, but my machine needed a reboot for an update to my burning software.
After the reboot, cpdn finally found the right server again. I don't see any significant changes in the xml files; maybe it got messed up because I have cpdn seasonal running as well.
It seems the problem solved itself.
old_user221094

Joined: 19 Jan 07
Posts: 9
Credit: 2,233,821
RAC: 0
Message 29532 - Posted: 13 Jul 2007, 11:11:53 UTC

Sorry about the slow reply, been sick for the last few days.

Can't see anything odd in stdoutdae.txt between the last good trickle and the first fail:

2007-07-02 13:59:50 [climateprediction.net] Restarting task hadcm3inct_cmmd_1920_160_25869521_2 using hadcm3i version 540
2007-07-03 00:08:55 [climateprediction.net] Sending scheduler request: To send trickle-up message
2007-07-03 00:08:55 [climateprediction.net] (not requesting new work or reporting completed tasks)
2007-07-03 00:09:00 [climateprediction.net] Scheduler RPC succeeded [server version 509]
2007-07-03 02:12:19 [---] Exit requested by user

To pause/resume tasks hit CTRL-C, to exit hit CTRL-BREAK
2007-07-03 02:12:31 [---] Starting BOINC client version 5.8.16 for windows_intelx86
2007-07-03 02:12:31 [---] log flags: task, file_xfer, sched_ops
2007-07-03 02:12:31 [---] Libraries: libcurl/7.16.0 OpenSSL/0.9.8a zlib/1.2.3
2007-07-03 02:12:31 [---] Data directory: C:\Program Files\BOINC
2007-07-03 02:12:31 [---] Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 2.40GHz [x86 Family 15 Model 2 Stepping 7] [fpu tsc sse mmx]
2007-07-03 02:12:31 [---] Memory: 1021.99 MB physical, 2.40 GB virtual
2007-07-03 02:12:31 [---] Disk: 74.50 GB total, 14.00 GB free
2007-07-03 02:12:31 [climateprediction.net] URL: http://climateprediction.net/; Computer ID: 675496; location: (none); project prefs: default
2007-07-03 02:12:31 [---] General prefs: from http://bbc.cpdn.org/ (last modified 2006-03-23 11:26:33)
2007-07-03 02:12:31 [---] Host location: none
2007-07-03 02:12:31 [---] General prefs: using your defaults
2007-07-03 02:12:31 [climateprediction.net] Restarting task hadcm3inct_cmmd_1920_160_25869521_2 using hadcm3i version 540
2007-07-03 14:23:33 [---] Exit requested by user

To pause/resume tasks hit CTRL-C, to exit hit CTRL-BREAK
2007-07-03 14:23:47 [---] Starting BOINC client version 5.8.16 for windows_intelx86
2007-07-03 14:23:47 [---] log flags: task, file_xfer, sched_ops
2007-07-03 14:23:47 [---] Libraries: libcurl/7.16.0 OpenSSL/0.9.8a zlib/1.2.3
2007-07-03 14:23:47 [---] Data directory: C:\Program Files\BOINC
2007-07-03 14:23:47 [---] Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 2.40GHz [x86 Family 15 Model 2 Stepping 7] [fpu tsc sse mmx]
2007-07-03 14:23:47 [---] Memory: 1021.99 MB physical, 2.40 GB virtual
2007-07-03 14:23:47 [---] Disk: 74.50 GB total, 13.84 GB free
2007-07-03 14:23:47 [climateprediction.net] URL: http://climateprediction.net/; Computer ID: 675496; location: (none); project prefs: default
2007-07-03 14:23:47 [---] General prefs: from http://bbc.cpdn.org/ (last modified 2006-03-23 11:26:33)
2007-07-03 14:23:47 [---] Host location: none
2007-07-03 14:23:47 [---] General prefs: using your defaults
2007-07-03 14:23:47 [climateprediction.net] Restarting task hadcm3inct_cmmd_1920_160_25869521_2 using hadcm3i version 540
2007-07-04 02:37:41 [---] Exit requested by user

To pause/resume tasks hit CTRL-C, to exit hit CTRL-BREAK
2007-07-04 02:37:54 [---] Starting BOINC client version 5.8.16 for windows_intelx86
2007-07-04 02:37:54 [---] log flags: task, file_xfer, sched_ops
2007-07-04 02:37:54 [---] Libraries: libcurl/7.16.0 OpenSSL/0.9.8a zlib/1.2.3
2007-07-04 02:37:54 [---] Data directory: C:\Program Files\BOINC
2007-07-04 02:37:54 [---] Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 2.40GHz [x86 Family 15 Model 2 Stepping 7] [fpu tsc sse mmx]
2007-07-04 02:37:54 [---] Memory: 1021.99 MB physical, 2.40 GB virtual
2007-07-04 02:37:54 [---] Disk: 74.50 GB total, 13.72 GB free
2007-07-04 02:37:54 [climateprediction.net] URL: http://climateprediction.net/; Computer ID: 675496; location: (none); project prefs: default
2007-07-04 02:37:54 [---] General prefs: from http://bbc.cpdn.org/ (last modified 2006-03-23 11:26:33)
2007-07-04 02:37:54 [---] Host location: none
2007-07-04 02:37:54 [---] General prefs: using your defaults
2007-07-04 02:37:54 [climateprediction.net] Restarting task hadcm3inct_cmmd_1920_160_25869521_2 using hadcm3i version 540
2007-07-04 05:27:21 [climateprediction.net] Sending scheduler request: To send trickle-up message
2007-07-04 05:27:21 [climateprediction.net] (not requesting new work or reporting completed tasks)
2007-07-04 05:27:22 [---] Project communication failed: attempting access to reference site
2007-07-04 05:27:23 [---] Access to reference site succeeded - project servers may be temporarily down.
2007-07-04 05:27:26 [climateprediction.net] Scheduler request failed: failed sending data to the peer
2007-07-04 05:27:26 [climateprediction.net] Deferring communication for 1 min 0 sec

Note that the exits are caused by backups (which happen every 12 hours)
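
For a long log, the window between the last good request and the first failure can be found mechanically. The Python sketch below assumes the `timestamp [project] message` line layout shown above and the two message prefixes that appear in this log; `last_success_and_first_failure` is a made-up helper name.

```python
import re

# Layout seen in the log above: "YYYY-MM-DD HH:MM:SS [project] message".
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(.*?)\] (.*)$")


def last_success_and_first_failure(lines):
    """Return (last_success, first_failure) as (timestamp, message) tuples.

    first_failure is the first failure logged after the last success, or the
    first failure overall if no success was seen. Either value may be None.
    """
    last_ok = None
    first_fail = None
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # blank lines and console hints carry no timestamp
        stamp, _project, message = match.groups()
        if message.startswith("Scheduler RPC succeeded"):
            last_ok = (stamp, message)
            first_fail = None  # a later success restarts the search
        elif message.startswith("Scheduler request failed") and first_fail is None:
            first_fail = (stamp, message)
    return last_ok, first_fail


log = [
    "2007-07-03 00:09:00 [climateprediction.net] Scheduler RPC succeeded [server version 509]",
    "2007-07-03 02:12:19 [---] Exit requested by user",
    "2007-07-04 05:27:26 [climateprediction.net] Scheduler request failed: failed sending data to the peer",
]
print(last_success_and_first_failure(log))
```

Everything logged between the two returned timestamps is the window worth reading by hand.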

--Richard