No trickles on HadCM3 Coupled Model?

Author	Message
old_user444429 Send message Joined: 23 Apr 07 Posts: 2 Credit: 16,959 RAC: 0	Message 28134 - Posted: 26 Apr 2007, 14:26:18 UTC Hi. I just started, and my computer has been crunching a HadCM3 5.4 WU for about 14h of CPU time now and is some 10 months into the model. Shouldn't it have produced some trickles by now? The web page says one trickle per month. Are they just not showing up, or is something going wrong that I should investigate? ID: 28134 · Reply Quote

Les Bayliss Volunteer moderator Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0	Message 28137 - Posted: 26 Apr 2007, 14:39:27 UTC I'm not sure which web page you're looking at, but it could be the one about slab models, which were different. The TCMs trickle once a year, on December 4th; a larger amount of data is uploaded as a zip file every 10 years (along with a trickle), and a restart dump every 40 years. Trickles may take a little while to show up on your Account page if the servers are busy, and the credit program only runs once a day. My 3.2GHz P4, running 24/7 with no other projects, takes a bit over 15 hours per model year. Backups: Here ID: 28137 · Reply Quote
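
The schedule described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes model years are counted from 1 and that the 10-year and 40-year uploads fall on exact multiples of 10 and 40, which the post does not spell out. The second function extrapolates the original poster's figures (about 14 hours of CPU time for 10 model months) to a full model year.

```python
def upload_events(model_year):
    """Uploads expected at the end of a given model year (assumed 1-based)."""
    events = ["trickle"]                  # one trickle per model year
    if model_year % 10 == 0:
        events.append("zip upload")       # larger data upload every 10 years
    if model_year % 40 == 0:
        events.append("restart dump")     # restart dump every 40 years
    return events


def hours_per_model_year(cpu_hours, model_months):
    """Scale the CPU time spent so far to a full 12-month model year."""
    return cpu_hours * 12 / model_months


if __name__ == "__main__":
    for year in (1, 10, 40):
        print(year, upload_events(year))
    # 14 h of CPU time for 10 model months, as in the first post
    print(round(hours_per_model_year(14, 10), 1), "hours per model year")
```

On those figures the first trickle would be expected after roughly 17 hours of CPU time, so seeing none at 14 hours is consistent with the yearly schedule.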

old_user444429 Send message Joined: 23 Apr 07 Posts: 2 Credit: 16,959 RAC: 0	Message 28146 - Posted: 26 Apr 2007, 18:01:46 UTC I was looking at http://climateapps2.oucs.ox.ac.uk/cpdnboinc/quick_faq.php, the section labelled 'so how long do these trickles take...', subsection Transient coupled models. That left me with the impression that it would trickle every month. Thanks for the rectification. ID: 28146 · Reply Quote

Les Bayliss Volunteer moderator Send message Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0	Message 28151 - Posted: 26 Apr 2007, 20:23:12 UTC OK, that's under section 1.3.2, and it's definitely an Oops. We had 3 different projects running at about the same time, and the person who did the update to the website back in 2005/6 must have 'gotten 2 pages stuck together', sort of thing. I'll report it to the team. Backups: Here ID: 28151 · Reply Quote

crandles Volunteer moderator Send message Joined: 16 Oct 04 Posts: 692 Credit: 277,679 RAC: 0	Message 28157 - Posted: 26 Apr 2007, 22:14:16 UTC I think that one is my fault. I wrote that for the wiki during beta testing when the trickle frequency was monthly. When the transient model was released the trickle frequency changed to yearly. It took me some time to spot the wiki needed changing and by the time it was brought to my attention, the wiki had been used as a basis for that faq. Sorry about that. http://boinc-wiki.ath.cx/index.php?title=Climateprediction_FAQ#Why_are_the_work_units_so_big.3F probably wants a bit of work again to put the transient model information first and archive the slab model information. (Though there may soon be some slab models again with better parameter values.) Visit BOINC WIKI for help And join BOINC Synergy for all the news in one place. ID: 28157 · Reply Quote

Bellator Send message Joined: 31 Mar 05 Posts: 44 Credit: 234,235 RAC: 0	Message 28891 - Posted: 22 May 2007, 16:36:46 UTC
22/05/2007 6:29:31 PM|climateprediction.net|Sending scheduler request: To send trickle-up message
22/05/2007 6:29:31 PM|climateprediction.net|(not requesting new work or reporting completed tasks)
22/05/2007 6:30:35 PM|climateprediction.net|Scheduler request failed: HTTP internal server error
22/05/2007 6:30:35 PM|climateprediction.net|Deferring communication for 22 min 36 sec
22/05/2007 6:30:35 PM|climateprediction.net|Reason: scheduler request failed
I have had this problem before and had detached as these messages just kept coming. Is the HTTP error something that may eventually be corrected, or am I wasting my computer time by continuing this project? I have done 35 hours and am past the yearend. ID: 28891 · Reply Quote
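
For anyone who wants to check a longer message log for the same failure, here is a minimal Python sketch. It assumes log lines of the form timestamp|project|message with day/month/year dates and a 12-hour clock, as in the excerpt above; the function names are made up for illustration and are not part of BOINC.

```python
from datetime import datetime

FAILURE_MARKER = "Scheduler request failed"


def parse_line(line):
    """Split one 'timestamp|project|message' log line into typed fields."""
    stamp, project, message = line.split("|", 2)
    when = datetime.strptime(stamp.strip(), "%d/%m/%Y %I:%M:%S %p")
    return when, project.strip(), message.strip()


def failed_requests(lines):
    """Return (timestamp, message) for every scheduler failure in the log."""
    failures = []
    for line in lines:
        when, _project, message = parse_line(line)
        if message.startswith(FAILURE_MARKER):
            failures.append((when, message))
    return failures


LOG = [
    "22/05/2007 6:29:31 PM|climateprediction.net|Sending scheduler request: To send trickle-up message",
    "22/05/2007 6:30:35 PM|climateprediction.net|Scheduler request failed: HTTP internal server error",
    "22/05/2007 6:30:35 PM|climateprediction.net|Deferring communication for 22 min 36 sec",
]

if __name__ == "__main__":
    for when, message in failed_requests(LOG):
        print(when.isoformat(), message)
```

Counting how often the failure line appears, and over what period, is a quick way to tell a one-off glitch from a persistent problem.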

old_user17525 Send message Joined: 13 Sep 04 Posts: 161 Credit: 284,548 RAC: 0	Message 28893 - Posted: 22 May 2007, 16:57:24 UTC Last modified: 22 May 2007, 16:57:46 UTC Hi Bellator, No, you're most definitely not wasting your time. The trickles will go up eventually, multiple trickles if need be, and they'll all be counted and useful. There was a server problem a couple of weeks ago but things are OK now, so it could be just a temporary glitch. Just let it keep trying until it succeeds, but don't detach or you'll lose the model and all the work that you've already done. ID: 28893 · Reply Quote

MikeMarsUK Volunteer moderator Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0	Message 28894 - Posted: 22 May 2007, 17:27:07 UTC Are you behind a firewall or proxy? I'm a volunteer and my views are my own. News and Announcements and FAQ ID: 28894 · Reply Quote

Bellator Send message Joined: 31 Mar 05 Posts: 44 Credit: 234,235 RAC: 0	Message 28897 - Posted: 22 May 2007, 19:13:50 UTC Yes, I have Kaspersky, but there is no problem there. I have compared my account with that of others crunching the same WU and it appears we all show the same data, so I will just be very patient this time around. ID: 28897 · Reply Quote

Richard Haselgrove Send message Joined: 1 Jan 07 Posts: 942 Credit: 34,176,368 RAC: 6,226	Message 28914 - Posted: 23 May 2007, 8:11:08 UTC - in response to Message 28894. Last modified: 23 May 2007, 8:32:54 UTC "Are you behind a firewall or proxy?" Mike, Bellator's log entry "Scheduler request failed: HTTP internal server error" is significant. It shows that a process failed on the servers at Oxford. There are significant problems with the BOINC back-end code at the moment. SETI@home deployed the 'latest version' a week ago, and it broke with exactly this error for a significant number of users. The precise mode of failure (anonymous platform mechanism) is unlikely to be relevant for CPDN, but it's worth a look. SETI tried to deploy a fix last night, but couldn't even get it to run ("segfaults in the scheduler CGI" - Eric Korpela). I think it would be a good idea if someone could just check that none of these problems have spilled over into the version of the BOINC server that CPDN currently uses. I don't know my way round the personnel at Oxford well enough to target this message properly - could you pass it on to the right person, please? ID: 28914 · Reply Quote

old_user17525 Send message Joined: 13 Sep 04 Posts: 161 Credit: 284,548 RAC: 0	Message 28915 - Posted: 23 May 2007, 8:31:30 UTC - in response to Message 28914. "I don't know my way round the personnel at Oxford well enough to target this message properly - could you pass it on to the right person, please?" Done, thanks. ID: 28915 · Reply Quote

MikeMarsUK Volunteer moderator Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0	Message 28919 - Posted: 23 May 2007, 10:56:07 UTC I have a vague memory of a client-side problem causing scheduler errors. This was back when account management systems were first introduced, and something was being corrupted. I'm a volunteer and my views are my own. News and Announcements and FAQ ID: 28919 · Reply Quote

Thyme Lawn Volunteer moderator Send message Joined: 5 Aug 04 Posts: 1283 Credit: 15,824,334 RAC: 0	Message 28921 - Posted: 23 May 2007, 12:37:13 UTC - in response to Message 28914. Last modified: 23 May 2007, 12:42:17 UTC "Bellator's log Scheduler request failed: HTTP internal server error is significant. It shows that a process failed on the servers at Oxford." It's not guaranteed to be a failure at Oxford. Some proxy servers (incorrectly) generate an HTTP 500 response when DNS lookups time out. From RFC 2616, section 10.5.5, "504 Gateway Timeout": The server, while acting as a gateway or proxy, did not receive a timely response from the upstream server specified by the URI (e.g. HTTP, FTP, LDAP) or some other auxiliary server (e.g. DNS) it needed to access in attempting to complete the request. Note: Note to implementors: some deployed proxies are known to return 400 or 500 when DNS lookups time out. "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer ID: 28921 · Reply Quote
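
The point about ambiguous status codes can be put as a small lookup. The sketch below is illustrative only: the function name and the wording of the candidate causes are my own, and it encodes nothing beyond the RFC 2616 note quoted above (a 504 points at a gateway or proxy, while a 500 may come either from the project server or from a proxy whose DNS lookup timed out).

```python
from http import HTTPStatus


def possible_origins(status_code):
    """List where an error response might have been generated."""
    status = HTTPStatus(status_code)
    if status == HTTPStatus.GATEWAY_TIMEOUT:
        # 504 is what a proxy should send on an upstream or DNS timeout
        return ["proxy or gateway"]
    if status == HTTPStatus.INTERNAL_SERVER_ERROR:
        # 500 is ambiguous: some proxies send it when DNS lookups time out
        return ["project server", "proxy (DNS lookup timed out)"]
    return ["not covered by this sketch"]


if __name__ == "__main__":
    for code in (500, 504):
        print(code, HTTPStatus(code).phrase, possible_origins(code))
```

So a 500 in the client log is compatible with both explanations offered in this thread, and more information (for example from HTTP debugging) is needed to tell them apart.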

Richard Haselgrove Send message Joined: 1 Jan 07 Posts: 942 Credit: 34,176,368 RAC: 6,226	Message 28930 - Posted: 23 May 2007, 18:23:45 UTC Happy to accept the alternative possibilities. But since there has been a recent BOINC server code change (part of the preparations to support the forthcoming BOINC v5.10.x range, now in Alpha testing) which definitely has caused this error, I thought it was worth flagging up - no harm in looking! ID: 28930 · Reply Quote

Bellator Send message Joined: 31 Mar 05 Posts: 44 Credit: 234,235 RAC: 0	Message 28961 - Posted: 25 May 2007, 8:33:29 UTC It is now May 25 and my statistics have not updated since May 19. I have the same problem with SETI - no update since May 22. Both contact the server many times a day. What should I do? Wait and hope it will rectify itself some day or just detach and try something else (I am thinking of opening a bottle of Rosé and sit on my terrace, contemplating this frustrating turn of events). ID: 28961 · Reply Quote

Bellator Send message Joined: 31 Mar 05 Posts: 44 Credit: 234,235 RAC: 0	Message 28962 - Posted: 25 May 2007, 9:17:07 UTC - in response to Message 28961. "It is now May 25 and my statistics have not updated since May 19. I have the same problem with SETI - no update since May 22. Both contact the server many times a day. What should I do? Wait and hope it will rectify itself some day or just detach and try something else (I am thinking of opening a bottle of Rosé and sit on my terrace, contemplating this frustrating turn of events)." Ten minutes later: SETI has suddenly contacted the server and my statistics have been updated. So there's hope for CPDN? ID: 28962 · Reply Quote

MikeMarsUK Volunteer moderator Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0	Message 28965 - Posted: 25 May 2007, 13:07:50 UTC Last modified: 25 May 2007, 13:21:07 UTC Possibly. The project servers haven't heard anything from your model since the 19th May. Any idea what changed on the 19th May (new version of something, Windows update, different firewall settings, ...)? Are you connecting via a proxy server? (Usually the answer is 'yes' if you're on a work computer or a university computer, usually 'no' if this is a home PC.) We can talk you through setting up debugging on HTTP transfers. This will give extra information, but I'm not sure there's much point because there's not much that can be changed on the BOINC side. It might be worth changing BOINC's protocol to HTTP 1.0 rather than 1.1; the instructions are in the following post (it's quite tricky, so don't hesitate to ask for help): http://boincfaq.mundayweb.com/index.php?language=1&view=91 <http_1_0> Set this flag to use HTTP 1.0 instead of 1.1 (this may be needed with some proxies). I'm a volunteer and my views are my own. News and Announcements and FAQ ID: 28965 · Reply Quote
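
As a rough illustration of the HTTP 1.0 change, the Python sketch below builds the XML that would carry the flag. The post only names the <http_1_0> flag; placing it inside an <options> element of a cc_config.xml file in the BOINC data directory is an assumption based on the general layout of BOINC client configuration, so please check the linked instructions before relying on it. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_cc_config(use_http_1_0=True):
    """Return cc_config.xml content with the http_1_0 flag set.

    Assumed layout: <cc_config><options><http_1_0>...</http_1_0></options></cc_config>
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    flag = ET.SubElement(options, "http_1_0")
    flag.text = "1" if use_http_1_0 else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML so it can be reviewed before being saved by hand
    print(build_cc_config())
```

The sketch deliberately prints the text instead of writing a file, so an existing configuration is not overwritten; the client would also need to re-read its configuration (or be restarted) for any change to take effect.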

Bellator Send message Joined: 31 Mar 05 Posts: 44 Credit: 234,235 RAC: 0	Message 28966 - Posted: 25 May 2007, 13:22:39 UTC - in response to Message 28965. "Possibly. The project servers haven't heard anything from your model since the 19th May. Any idea what changed on the 19th May? (new version of something, windows update, different firewall settings, ... ?) Are you connecting via a proxy server? (usually the answer is 'yes' if you're on a work computer or a university computer, usually 'no' if this is a home PC). We can talk you through setting up debugging on http transfers, this will give extra information but I'm not sure if there's much point." No, I cannot think of anything special. I have done the upload handler test and I get a -1. I have 20 projects which all update regularly (there was a problem with SETI but that has corrected itself today). The real problem, I think, is more basic, because this is at least the fourth or fifth time I have restarted CPDN. Each time in the past, the project ran OK for three or four days, then "froze" as it has done now. I use Kaspersky, but have not changed any firewall settings. I use a home computer. One more thing: I am running two work units at the same time (insurance, you know), and one shows "disturbed parameters", the other nothing. Right now, each has been going for close to 50 hours. Would hate to detach again. ID: 28966 · Reply Quote

MikeMarsUK Volunteer moderator Send message Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0	Message 29005 - Posted: 27 May 2007, 9:40:06 UTC Could you give that HTTP 1.0 thing a try to see what happens? I'm a volunteer and my views are my own. News and Announcements and FAQ ID: 29005 · Reply Quote

Bellator Send message Joined: 31 Mar 05 Posts: 44 Credit: 234,235 RAC: 0	Message 29009 - Posted: 27 May 2007, 11:13:07 UTC - in response to Message 29005. Could you give that HTTP 1.0 thing a try to see what happens? If this is addressed to me, I am afraid I do not understand... ID: 29009 · Reply Quote