Posts by staffann

1) Message boards : Number crunching : Upload problems (Message 41963) Posted 10 Apr 2011 by staffann Post: The files are uploading now. All that was needed was to restart the daemon:
sudo /etc/init.d/boinc-client restart
No idea why it was necessary, but I'm happy that it worked. Thanks for the help.
2) Message boards : Number crunching : Upload problems (Message 41962) Posted 10 Apr 2011 by staffann Post: Yes, there is the message about not being able to access the internet. It still appears after every attempt to upload a climateprediction file, but at the same time all other projects communicate just fine. If I use ssh to log onto the server, I see that I can access the internet from it (and even ping kraken, as previously mentioned). There are now 31 files waiting to upload, all from climateprediction.
3) Message boards : Number crunching : Upload problems (Message 41959) Posted 10 Apr 2011 by staffann Post: Thank you for the replies. I ran grep on the client_state.xml file. I couldn't find any misspelt file_upload_handler. All of the references (and there were many) were to kraken, so I pinged kraken from the server, and it worked perfectly. I made the client_state file available here: http://staffannilsson.eu/Unrelated/client_state.xml I'm at a loss as to how to proceed.
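The grep check described in this post could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions: it treats any http(s) URL whose text contains `file_upload_handler` as an upload handler reference, the "misspelt" heuristic is my own guess, and the sample text and the host `kraken.example` are made up for illustration (they are not taken from the real client_state.xml).

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumption: handler references appear as plain http(s) URLs in the file.
URL_RE = re.compile(r"https?://[^\s<>\"']+")

def upload_handler_hosts(text):
    """Count, per host, the URLs that contain the exact name file_upload_handler."""
    hosts = Counter()
    for url in URL_RE.findall(text):
        if "file_upload_handler" in url:
            hosts[urlparse(url).hostname] += 1
    return hosts

def suspicious_urls(text):
    """Hypothetical heuristic: URLs that mention 'handler' but not the exact name."""
    return [
        url for url in URL_RE.findall(text)
        if "handler" in url and "file_upload_handler" not in url
    ]

# Made-up sample lines, not the real file contents.
sample = """
upload url: http://kraken.example/cgi-bin/file_upload_handler
upload url: http://kraken.example/cgi-bin/file_upload_handler
upload url: http://kraken.example/cgi-bin/file_uplaod_handler
"""

print(upload_handler_hosts(sample))
print(suspicious_urls(sample))
```

If every reference points at the same host and nothing shows up as suspicious, the file itself is probably not the culprit, which matches what the post found.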
4) Message boards : Number crunching : Upload problems (Message 41939) Posted 9 Apr 2011 by staffann Post: I noticed that my 24/7 Linux server had not gotten any credits from climateprediction for a while, so I thought I'd look and see what's happening. It seems it cannot upload result files despite the server status on your page saying OK. The messages from BOINC are posted below. As you can see, other projects communicate just fine. Any suggestions? Thanks!
2011-04-09 16:48:01 rosetta@home Reporting 1 completed tasks, not requesting new tasks
2011-04-09 16:48:02 rosetta@home Started upload of mem_tid3_run06_A_1afo_SAVE_ALL_OUT_IGNORE_THE_REST_22930_14947_0_0
2011-04-09 16:48:06 rosetta@home Scheduler request completed
2011-04-09 16:48:09 rosetta@home Finished upload of mem_tid3_run06_A_1afo_SAVE_ALL_OUT_IGNORE_THE_REST_22930_14947_0_0
2011-04-09 16:48:11 World Community Grid Sending scheduler request: To fetch work.
2011-04-09 16:48:11 World Community Grid Reporting 3 completed tasks, requesting new tasks
2011-04-09 16:48:16 World Community Grid Scheduler request completed: got 1 new tasks
2011-04-09 16:48:18 World Community Grid Started download of E201784_622_C.22.C20H14N2.00010713.3.set1d06_C.22.C20H14N2.00010713.3.zip
2011-04-09 16:48:21 World Community Grid Finished download of E201784_622_C.22.C20H14N2.00010713.3.set1d06_C.22.C20H14N2.00010713.3.zip
2011-04-09 17:05:00 climateprediction.net Started upload of famous_xaby_1999_200_007075221_2_10.zip
2011-04-09 17:05:00 climateprediction.net Started upload of famous_voj9_999_200_006734889_5_2.zip
2011-04-09 17:05:24 Project communication failed: attempting access to reference site
2011-04-09 17:05:24 climateprediction.net Temporarily failed upload of famous_xaby_1999_200_007075221_2_10.zip: connect() failed
2011-04-09 17:05:24 climateprediction.net Backing off 1 hr 38 min 10 sec on upload of famous_xaby_1999_200_007075221_2_10.zip
2011-04-09 17:05:24 climateprediction.net Temporarily failed upload of famous_voj9_999_200_006734889_5_2.zip: connect() failed
2011-04-09 17:05:24 climateprediction.net Backing off 2 hr 42 min 44 sec on upload of famous_voj9_999_200_006734889_5_2.zip
2011-04-09 17:05:47 BOINC can't access Internet - check network connection or proxy configuration.
2011-04-09 18:06:01 World Community Grid Computation for task dg01_c002_pr56b1_0 finished
2011-04-09 18:06:01 World Community Grid Starting E201784_622_C.22.C20H14N2.00010713.3.set1d06_0
2011-04-09 18:06:01 World Community Grid Starting task E201784_622_C.22.C20H14N2.00010713.3.set1d06_0 using cep2 version 640
2011-04-09 18:06:03 World Community Grid Started upload of dg01_c002_pr56b1_0_0
2011-04-09 18:06:03 World Community Grid Started upload of dg01_c002_pr56b1_0_1
2011-04-09 18:06:10 World Community Grid Finished upload of dg01_c002_pr56b1_0_0
2011-04-09 18:06:10 World Community Grid Started upload of dg01_c002_pr56b1_0_2
2011-04-09 18:06:11 World Community Grid Finished upload of dg01_c002_pr56b1_0_1
2011-04-09 18:06:11 World Community Grid Finished upload of dg01_c002_pr56b1_0_2
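To see how long the client intends to wait before retrying each upload, the "Backing off" messages in a log like the one above can be converted to seconds. The following is a minimal sketch, assuming the message wording is exactly as in this log; the regular expression and the function name are my own, and handling of messages that omit the hours or minutes part is a guess.

```python
import re

# Assumption: wording follows the log above, e.g.
# "Backing off 1 hr 38 min 10 sec on upload of <file>".
BACKOFF_RE = re.compile(
    r"Backing off (?:(\d+) hr )?(?:(\d+) min )?(?:(\d+) sec )?on upload of (\S+)"
)

def backoff_seconds(message):
    """Return (file name, back-off in seconds), or None if the line is not a back-off message."""
    m = BACKOFF_RE.search(message)
    if m is None:
        return None
    hours, minutes, seconds = (int(g) if g else 0 for g in m.group(1, 2, 3))
    return m.group(4), hours * 3600 + minutes * 60 + seconds

line = ("2011-04-09 17:05:24 climateprediction.net Backing off 1 hr 38 min 10 sec "
        "on upload of famous_xaby_1999_200_007075221_2_10.zip")
print(backoff_seconds(line))
```

For the two back-off lines in the log this gives 5890 and 9764 seconds respectively, so neither file would be retried for well over an hour unless a retry is triggered by hand.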
5) Message boards : climateprediction.net Science : Climate Change - The Game (Message 39106) Posted 1 Mar 2010 by staffann Post: The game was pretty fun but took a long time. Results? Environment 80% (very low level of carbon emissions), Wealth 4% (economy in ruins...) and popularity 100%. I don't think the wealth and popularity figures are a possible combination in real life...
6) Message boards : Number crunching : Too many total results (Message 39104) Posted 1 Mar 2010 by staffann Post: I asked the same question a few years ago and suggested that the numbers should be changed so that users aren't confused or worried by irrelevant error messages. Let's hope something is done about it this time! Should be easy to do.
7) Questions and Answers : Preferences : Resource share problem (Message 38609) Posted 1 Jan 2010 by staffann Post: I still have this problem when I try to change the resource allocation from BAM. BAM reports "incorrect response from project" or similar - I used the Swedish language version of BAM so the real English language error message may differ.
8) Message boards : Number crunching : When to abort 160 year coupled model (Message 32538) Posted 8 Feb 2008 by staffann Post: Models of the generation you're running trickle every model year, make a larger upload every 10 model years and a full restart dump every 40 model years. Your two models are at 2001 and 2054 and the machine is a fast one. If you're going to abort one and leave the other running, then I would abort the 2001 model and let the other finish. However, you have a dual-core machine so they will each be running at nearly full speed - so aborting one will not speed the other one up by much, in case that's what you were hoping.
The idea was indeed to abort the one at 2001. The problem is not the speed of the computer so much as the fact that I cannot leave it on. Its low uptime means that it takes forever for it to finish a model. These models have crunched since 2006 (!) and I want to be able to do some other projects as well. Since a full restart dump is uploaded every 40 years, the model I intend to abort has only run 1 year after that. It seems to me that I might as well abort it where it is. I'll finish my other model and let the other core crunch some more WCG and Rosetta workunits. I'll also make sure my CPDN preferences are set to download only shorter models (maybe even slab only). Thank you!
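The reasoning about restart dumps above can be written out as a small calculation. This is a sketch only: the three intervals come from the post, but the function names are mine, and the figure of 81 elapsed model years is a hypothetical value chosen so that the remainder matches the "only run 1 year after that" in the post (the model's actual start year is not stated here).

```python
# Intervals in model years, as described in the post.
TRICKLE_EVERY = 1
UPLOAD_EVERY = 10
RESTART_DUMP_EVERY = 40

def years_since_restart_dump(elapsed_model_years):
    """Model years of work done since the most recent full restart dump."""
    return elapsed_model_years % RESTART_DUMP_EVERY

def events_at(elapsed_model_years):
    """List what the model sends when it completes the given model year."""
    events = []
    if elapsed_model_years > 0:
        if elapsed_model_years % TRICKLE_EVERY == 0:
            events.append("trickle")
        if elapsed_model_years % UPLOAD_EVERY == 0:
            events.append("larger upload")
        if elapsed_model_years % RESTART_DUMP_EVERY == 0:
            events.append("full restart dump")
    return events

# Hypothetical: 81 model years completed, i.e. one year past a restart dump.
print(years_since_restart_dump(81))
print(events_at(80))
```

With only one model year completed since the last full dump, aborting at this point discards very little work that has not already been uploaded in full.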
9) Message boards : Number crunching : When to abort 160 year coupled model (Message 32531) Posted 8 Feb 2008 by staffann Post: My computer has crunched two 160 year models for quite some time, but since it is only on every once in a while I'd like to abort one of them. My question is when it is a good time to abort it. Does it matter? Should I wait until a certain model time? The model is just past 50%. I think I read the answer a very long time ago, but I cannot find the info now so please help. /Staffan
10) Questions and Answers : Windows : "Overcommitted" computer never runs CPDN (Message 25205) Posted 20 Nov 2006 by staffann Post: Thanks! I guess the easiest solution is just to let CPDN download two workunits. I'll give that a try!
11) Questions and Answers : Windows : "Overcommitted" computer never runs CPDN (Message 25170) Posted 18 Nov 2006 by staffann Post: Thanks for the answer. BOINC is operating as it is designed to do. Unfortunately, it isn't always the way we want it to. BOINC is worried that CPDN may not be finished in time. As a way of solving it, it makes sure that CPDN never runs. No, that isn't exactly the way I would want it to work! Is there any number in the client_state or other file that I can change in order to make the computer believe (realize) that it isn't overcommitted as a result of CPDN?
12) Questions and Answers : Windows : "Overcommitted" computer never runs CPDN (Message 25164) Posted 18 Nov 2006 by staffann Post: Some time ago I updated my BOINC client to 5.4.9. Now it always says that the computer is overcommitted and won't run my CPDN model. I have a dual core (AMD Athlon 64 3800+ X2) processor, and I want to run CPDN on one core and WCG on the other. In the past I have managed to do that by having CPDN allocate somewhat above 50% of the resource share, letting CPDN download a model, and then stopping CPDN from downloading more jobs. Since the BOINC client upgrade, there is always a message that the computer is overcommitted and is using earliest deadline first. That means that it will always choose to run WCG (or malariacontrol) on both cores (it downloads two WCG/malariacontrol workunits at the same time). I thought at first that it was because it needed to learn how much time it takes to process a WCG workunit, but since it's been doing lots and lots of them by now, that seems unlikely. Could it instead be CPDN that is causing the overcommitted message (even though the deadline is way off in 2008)?
I enclose the first lines from the BOINC manager:
2006-11-18 13:24:19||Starting BOINC client version 5.4.9 for windows_intelx86
2006-11-18 13:24:19||libcurl/7.15.3 OpenSSL/0.9.8a zlib/1.2.3
2006-11-18 13:24:19||Data directory: C:\Program\BOINC
2006-11-18 13:24:21|SETI@home|Found app_info.xml; using anonymous platform
2006-11-18 13:24:22||Processor: 2 AuthenticAMD AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
2006-11-18 13:24:22||Memory: 1023.23 MB physical, 3.90 GB virtual
2006-11-18 13:24:22||Disk: 76.32 GB total, 8.87 GB free
2006-11-18 13:24:22|CPDN Seasonal Attribution Project|URL: http://attribution.cpdn.org/; Computer ID: 311; location: ; project prefs: default
2006-11-18 13:24:22|climateprediction.net|URL: http://climateprediction.net/; Computer ID: 239124; location: home; project prefs: default
2006-11-18 13:24:22|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 1497356; location: ; project prefs: default
2006-11-18 13:24:22|malariacontrol.net beta|URL: http://www.malariacontrol.net/; Computer ID: 4496; location: home; project prefs: default
2006-11-18 13:24:22|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 2536; location: Default; project prefs: default
2006-11-18 13:24:22||General prefs: from World Community Grid (last modified 2005-12-12 07:37:53)
2006-11-18 13:24:22||General prefs: no separate prefs for Default; using your defaults
2006-11-18 13:24:22||Local control only allowed
2006-11-18 13:24:22||Listening on port 31416
2006-11-18 13:24:23|climateprediction.net|Deferring task hadcm3lbm_agup_25257668_0
2006-11-18 13:24:23|World Community Grid|Resuming task faah0952_d071n043_x1AJX_00_0 using faah version 528
2006-11-18 13:24:24|World Community Grid|Resuming task faah0952_d071n044_x1AJX_00_2 using faah version 528
2006-11-18 13:24:24||Using earliest-deadline-first scheduling because computer is overcommitted.
2006-11-18 13:24:24||Suspending work fetch because computer is overcommitted.
2006-11-18 13:24:26||Suspending computation - user is active
2006-11-18 13:24:26|World Community Grid|Pausing task faah0952_d071n043_x1AJX_00_0 (left in memory)
2006-11-18 13:24:26|World Community Grid|Pausing task faah0952_d071n044_x1AJX_00_2 (left in memory)
2006-11-18 13:24:26||Suspending network activity - user is active
2006-11-18 13:25:47||Resuming computation
2006-11-18 13:25:47||Rescheduling CPU: Resuming computation
2006-11-18 13:25:47||Resuming network activity
2006-11-18 13:25:47|World Community Grid|Resuming task faah0952_d071n043_x1AJX_00_0 using faah version 528
2006-11-18 13:25:47|World Community Grid|Resuming task faah0952_d071n044_x1AJX_00_2 using faah version 528
SETI and seasonal attrib are set not to get any new workunits and therefore never run. If I look in the client_state file, I see that the long term debt is a large positive number for CPDN, and negative for malaria and WCG.
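Why earliest-deadline-first keeps CPDN off both cores can be illustrated with a toy example. This is not BOINC's actual scheduler, just a minimal sketch of the general EDF idea; the task names come from the log above, but all the deadline dates are hypothetical (the post only says the CPDN deadline is "in 2008").

```python
from datetime import date

def pick_edf(tasks, n_cpus):
    """Return the names of the n_cpus tasks with the earliest deadlines."""
    ordered = sorted(tasks, key=lambda task: task[1])
    return [name for name, _deadline in ordered[:n_cpus]]

# (task name, deadline). All dates below are made up for illustration.
tasks = [
    ("hadcm3lbm_agup_25257668_0", date(2008, 1, 1)),       # CPDN model
    ("faah0952_d071n043_x1AJX_00_0", date(2006, 11, 25)),  # WCG workunit
    ("faah0952_d071n044_x1AJX_00_2", date(2006, 11, 26)),  # WCG workunit
]

print(pick_edf(tasks, 2))
```

Under pure EDF, as long as there are at least two short-deadline workunits in the queue, the long-deadline model is never among the two earliest, which is consistent with the behaviour described in the post.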
13) Questions and Answers : Windows : Automatic Backup any good? (Message 20455) Posted 18 Feb 2006 by staffann Post: I use this script to make automatic backups. It is run on a Swedish WinXP, so you'll have to modify it for your language (the path to the BOINC folder, and the name of the ntbackup window, which is "säkerhetskopiering" in Swedish). Save it as a *.vbs file and just double-click to run. I also schedule it to run once a day. Don't forget to remove old backups, since this script doesn't overwrite them.
set WshShell = WScript.CreateObject("WScript.Shell")
WshShell.logevent 4, "Starting backup of BOINC folder"

REM Exit BOINC
ret = WshShell.AppActivate("BOINC Manager")
if ret = false then
    WshShell.logevent 1, "Could not find BOINC to close it!"
    WScript.quit -1
end if
WScript.Sleep 1000
WshShell.SendKeys "{F10}{k}{p}~"
WScript.Sleep 1000
WshShell.SendKeys "{F10}{a}{a}~"
WScript.Sleep 10000
ret = WshShell.AppActivate("BOINC Manager")
if ret = true then
    WshShell.logevent 1, "BOINC is still running after attempt to close it!"
    WScript.quit -1
end if

REM Backup BOINC
REM Colons are not allowed in file names, so replace them in the time stamp
MyTime = Time
MyTime = Replace(MyTime, ":", "_")
BackupName = "C:\BOINC_Backups\BOINC_Backup_" & Date & "_" & MyTime
BackupCommand = "ntbackup backup c:\program\BOINC /J ""BOINC Backup"" /F """ + BackupName + """"
rem MsgBox BackupCommand
WshShell.Run BackupCommand,1,false
WScript.Sleep 180000
REM Check every 30 seconds whether the ntbackup window is still open; give up at the 30th check
ret = WshShell.AppActivate("Säkerhetskopiering")
i = 0
while ret = true
    i = i+1
    if i = 30 then
        WshShell.logevent 1, "Ntbackup still running. Not restarting BOINC!"
        WScript.quit -1
    end if
    WScript.Sleep 30000
    ret = WshShell.AppActivate("Säkerhetskopiering")
wend

REM Restart BOINC
REM set WshShell = WScript.CreateObject("WScript.Shell")
WScript.Sleep 2000
WshShell.Run "boincmgr.exe",1,false
WScript.Sleep 10000
ret = WshShell.AppActivate("BOINC Manager")
if ret = false then
    WshShell.logevent 1, "Could not restart BOINC!"
    WScript.quit -1
end if
WScript.Sleep 5000
REM run always
WshShell.SendKeys "{F10}{k}{k}~"
WScript.Sleep 5000
REM Turn off network access
REM WshShell.SendKeys "{F10}{k}{f}~"
REM WScript.Sleep 1000
WshShell.logevent 0, "Successfully completed backup of BOINC folder"
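The backup naming and the reminder to remove old backups can be sketched in Python as well. This is not a port of the script above, only an illustration of two of its ideas: it swaps the locale-dependent Date and Time values for a fixed `YYYY-MM-DD_HH_MM_SS` stamp (my choice, so that names sort chronologically), and the function names and the `keep` parameter are hypothetical.

```python
from datetime import datetime

def backup_name(now, folder="C:\\BOINC_Backups"):
    """Build a backup file name with a time stamp that contains no colons."""
    stamp = now.strftime("%Y-%m-%d_%H_%M_%S")
    return f"{folder}\\BOINC_Backup_{stamp}"

def backups_to_delete(names, keep):
    """Return the oldest names, leaving the newest `keep` untouched.

    Relies on the fixed-width stamp above, so plain string sorting
    is also chronological sorting.
    """
    ordered = sorted(names)
    return ordered[:-keep] if keep > 0 else ordered

names = [
    backup_name(datetime(2006, 2, 18, 13, 24, 19)),
    backup_name(datetime(2006, 2, 16, 13, 24, 19)),
    backup_name(datetime(2006, 2, 17, 13, 24, 19)),
]
print(backups_to_delete(names, keep=2))
```

The functions only compute names; actually deleting the files is left out on purpose, since that is the part worth double-checking by hand.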
14) Message boards : Number crunching : Sulphur model phase 3 unchanged since phase 2 (Message 19499) Posted 21 Jan 2006 by staffann Post: Faulty batch. See this post (and some others in that thread) for more explanation. DaveF said they may be useful anyway for computer comparisons, but you certainly can abort it and get a new one if you so desire.
Thank you. The thread referred to is a bit unclear on the topic of whether to abort the run or not...
Well the runs are still useful (even if there is a measure of redundancy in them), so we don't want to insist people kill those runs, but if you want to make best use of your computer's processing time, you could kill the run and get a new experiment from Tolu's list (which he's generating right now). Maybe give it a couple of hours till the jobs are in place.
This quote from DaveF indicates that it would be more efficient to kill the model, but is this still true after 3 phases have already been processed? Then there is also the issue of when the coupled model will be given a go on the main CPDN site. If that is soon, I may just as well wait for that rather than starting a new sulphur, and then finish this one in the meantime. Does anyone have info on that one?
15) Message boards : Number crunching : Sulphur model phase 3 unchanged since phase 2 (Message 19486) Posted 21 Jan 2006 by staffann Post: I have a sulphur model where the results from phases 2 and 3 seem identical. See http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=1208709 From what I understand the first 3 phases in sulphur should have about the same reaction as a normal slab, with increased CO2 in phase 3 usually leading to increased temperature. Is that correct? Is this then a faulty work unit? Should I abort the model?
16) Message boards : Number crunching : Database Sluggish -- Moved Trickle Info 'Down' (Message 18569) Posted 21 Dec 2005 by staffann Post: A "recent trickles" page would indeed be very welcome, if it can be done without excess db load. The way it is now is a bit annoying in the long run.
17) Questions and Answers : Windows : cp PHASES (Message 17811) Posted 6 Dec 2005 by staffann Post: If you start the graphics, you will see a line "Phase: 1 / Timestep: X of Y". The Y number is the number of time steps in the current phase. Once X has reached Y, the model will go to the next phase. Give it some time: you are running a complete climate model of the earth, so it will take a while to finish.
18) Questions and Answers : Windows : Can I restart incomplete WU's? (Message 17184) Posted 14 Nov 2005 by staffann Post: For people who run multiple projects, this can get messy. If you're in that situation, say so, and I'll post a how-to-do-it.
It would be very interesting to know if I can restore cp without messing up work done in other projects. I had to restore a backup today, fortunately with no major consequence for other projects. It would be good to know how to do it next time.
19) Questions and Answers : Wish list : Processor specific optimization? (Message 16979) Posted 4 Nov 2005 by staffann Post: The SSE and SSE2 optimizations are being used on the Intel processors, but the version of the Intel Fortran compiler used to compile hadsm3 disables SSE type optimizations if it sees other than an Intel CPU running the compiled program. The project now has a later version of the Intel Fortran compiler which does not hobble the AMD chips so much. Later versions of hadsm will no doubt have optimizations for AMD chips, but when this will happen is unknown.
I hope optimisation for AMD processors happens soon! For seti@home my Athlon X2 3800+ needed half the processing time for a WU when I downloaded optimised code. Crippling AMD chips this way is really not the way forward!