Questions and Answers :
Windows :
Computation error
Author | Message |
---|---|
Joined: 18 Feb 06 Posts: 4 Credit: 76,344 RAC: 0 |
Hi, I've been running two climate simulations side-by-side on my quad-core CPU. Each had been running for 350 hours and was 18.5% complete (with ~1200 hrs left to completion). However, after a reboot (due to a Windows Update) one of them showed as "Computation error". I closed BOINC and reloaded it and it now says "Ready To Report". Obviously something went wrong. Is there anything I can do, given the amount of time I've invested in it? The specific task is: hadcm3istd_1414_1920_160_15995109_4 I've closed BOINC to stop it from being uploaded (given that its current state is Ready to Report) in case I can somehow salvage and resume it. Thanks, Tim |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Models can only be restarted from a backup of the entire BOINC folder (version 5), or the entire BOINC data folder (version 6), made BEFORE the failure. See my sig for a link to details. The failure was probably due to restarting the computer without first stopping BOINC. Backups: Here |
Joined: 18 Feb 06 Posts: 4 Credit: 76,344 RAC: 0 |
OK, so that particular simulation is now unrecoverable? No backups are automatically generated? I'm actually quite particular about BOINC, and closed it down (fully exited the program, as confirmed by the systray icon disappearing) some 20 seconds before telling Windows to reboot. I always monitor my CPU temperatures and they dropped, as usual, confirming BOINC was closed. However, when the computer rebooted it hung on a black screen before booting into Windows. After leaving it five minutes in case it was doing something behind the scenes, I eventually had to hit the Reset button. Nothing else was functioning. The reboot worked this time, but then the computation error was shown when BOINC was restarted. Looks like I found out about the need for making manual backups too late. =/ Thanks, Tim |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
It would be very nice if BOINC could back its own project data up regularly. The problem is that you need to have exited completely from BOINC before the backup is made, otherwise files might be in use and the backup wouldn't be restorable. Once you've stopped BOINC it can't do anything for itself, for the models or for you. Have a look at the backup methods in the README and choose the one that suits you best. With a fast quad like yours it would probably be a good idea to back up every couple of days. Then if you ever did need to restore a backup to rescue a crashed model you wouldn't need to repeat much crunching. Cpdn news |
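For readers who would rather script the manual backup described above, here is a minimal Python sketch. It is not one of the methods from the project's README; it simply zips the whole data folder into a timestamped archive. The function name `backup_boinc_data` and the example paths are hypothetical, and the sketch assumes BOINC has already been fully exited so that no files are in use.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_boinc_data(data_dir: Path, backup_root: Path) -> Path:
    """Zip the entire BOINC data folder into a timestamped archive.

    Assumption: BOINC has been completely exited before this is called,
    otherwise files may be in use and the copy may not be restorable.
    backup_root should live OUTSIDE data_dir, or the archive would try
    to include itself.
    """
    if not data_dir.is_dir():
        raise FileNotFoundError(f"BOINC data folder not found: {data_dir}")
    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive_base = backup_root / f"boinc-backup-{stamp}"
    # make_archive appends ".zip" and returns the full path of the archive.
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=data_dir))


# Hypothetical usage (both paths are placeholders for your own system):
# backup_boinc_data(Path(r"C:\ProgramData\BOINC"), Path(r"D:\boinc-backups"))
```

Keeping one archive per timestamp means an older backup is still available if the most recent one turns out to have been taken after the model had already failed.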
Joined: 18 Feb 06 Posts: 4 Credit: 76,344 RAC: 0 |
Indeed. A quick RAR backup of the whole folder takes only a little over 60 seconds with half a gig of data in there, after everything's closed/cleared. Very easy to do. Wish I'd somehow been warned beforehand, I'd have taken one every day! Thanks. :) |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
We've been urging people to make backups for years. The problem is to get people to read what's posted. Backups: Here |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Tímo, go to the links in Les's signature and mine. They will take you straight to the most useful CPDN information pages and forum threads. Cpdn news |
Joined: 18 Feb 06 Posts: 4 Credit: 76,344 RAC: 0 |
"We've been urging people to make backups for years. The problem is to get people to read what's posted." Actually, I ran BOINC standalone and had never been on any BOINC forums (never saw the need to) until I got the error, so I would never have known. I googled to find an answer, and it was only that that brought me here. "Tímo, go to the links in Les's signature and mine. They will take you straight to the most useful CPDN information pages and forum threads." Thanks. :) |
Joined: 23 Aug 09 Posts: 5 Credit: 730,885 RAC: 279 |
I've gotten "computation error" as the result for 3 out of 4 of my completed results, without any sort of update. |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Matt. This 22 error you've been getting is a nuisance because it covers so many different possible things that may have gone wrong. I can only suggest backing up your Boinc Data folder regularly in case the same thing happens again, and restoring it if the task crashes. The server will accept whatever new trickles and files you upload. The longer a task lasts, the more useful it is to do this. Have a look at the README collection about problems to see if there's anything you should be doing and aren't. There's a link in my signature. There's also a README about how to back up and restore if you haven't done it before. Les's quick manual method works perfectly. Cpdn news |
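The restore half of that advice can be sketched in Python as well. This is only an illustration, not the procedure from the README: it assumes the backup is a zip of the whole data folder taken before the crash, and that BOINC is fully exited while it runs. The function name `restore_boinc_data` and the `.crashed` suffix are hypothetical choices.

```python
import shutil
from pathlib import Path


def restore_boinc_data(archive: Path, data_dir: Path) -> Path:
    """Replace the BOINC data folder with the contents of a backup archive.

    The current (crashed) folder is renamed rather than deleted, so nothing
    is lost if the restore goes wrong. Returns the path of the set-aside
    folder. Assumption: BOINC is not running while this executes.
    """
    if not archive.is_file():
        raise FileNotFoundError(f"Backup archive not found: {archive}")
    set_aside = data_dir.with_name(data_dir.name + ".crashed")
    if set_aside.exists():
        raise FileExistsError(f"Remove or rename the old copy first: {set_aside}")
    if data_dir.exists():
        data_dir.rename(set_aside)
    # The archive format is inferred from the ".zip" extension.
    shutil.unpack_archive(str(archive), str(data_dir))
    return set_aside
```

Renaming instead of deleting costs some disk space, but it leaves a way back if the chosen backup turns out to be older than expected.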
Joined: 23 Aug 09 Posts: 5 Credit: 730,885 RAC: 279 |
The problem is that I don't usually even know until it's already been reported as such, so backing it up might not help. I have Boinc running when I'm online, playing games and even when I'm away from my computer for a little, so I might not check up on it for hours at a time. Apparently, as it uploads every year of data at least and I still get points for these, it is still contributing, but I'd just like to see more complete cycles. It seems that a model either gets the error within a week of running or keeps going: the one that completed had been going since I've been on here and only finished maybe 2 weeks ago (though the deadline was sometime in late summer 2010!). I'm pretty bad with computer stuff so a lot of that is beyond me. Edit: I read the links and couldn't make head or tail of it. Way too technical for me. Note I use IE because I was completely unable to figure out how to use any other browser. I can't find a BOINC or climate prediction folder anywhere. |
Joined: 31 Dec 07 Posts: 1152 Credit: 22,122,158 RAC: 2,289 |
Hi, Matt: The boinc folder is in the programdata folder. The reason you can't find it is that Windows hides this folder by default. To make it visible, click on "Control Panel" and then click on "Folder Options". Once "Folder Options" is open, click the "View" tab. Scroll down to "Hidden files and folders" and click "Show hidden files, folders, and drives." Then click "OK". This will make it visible. |
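A hidden folder can also be located without changing any Explorer settings, since scripts can see it regardless. The Python sketch below checks two candidate locations: `C:\ProgramData\BOINC`, which is an assumption based on the "programdata folder" mentioned above, and the Windows XP location that appears in the client log later in this thread. The function name `find_boinc_data_dir` is hypothetical, and a custom install may use a different path entirely.

```python
from pathlib import Path
from typing import Iterable, Optional

# Candidate locations only; neither is guaranteed to exist on a given machine.
CANDIDATE_DATA_DIRS = (
    Path(r"C:\ProgramData\BOINC"),
    Path(r"C:\Documents and Settings\All Users\Application Data\BOINC"),
)


def find_boinc_data_dir(
    candidates: Iterable[Path] = CANDIDATE_DATA_DIRS,
) -> Optional[Path]:
    """Return the first candidate that exists as a folder, or None.

    Hidden folders are found too: the "hidden" attribute only affects
    what Windows Explorer displays, not what a script can access.
    """
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None
```

The most reliable source for the real path is still the "Data directory:" line that the BOINC client prints in its startup messages.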
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Matt. You can see what happened to your past models by clicking on your own name on the forum, then following the links to your computer and its tasks. One of the crashed models is here. If you click on + you'll see messages about how the model progressed. Unfortunately we can only see these messages once a model's finished, completed or crashed, so we can't see whether anything's going wrong as it happens. I don't think you need to do anything very complicated. Try this first. I think your models may be crashing because you turn off the computer without exiting from Boinc first, and it sounds as if you turn off the computer every day. Mostly this does no harm, but sooner or later when you turn off the computer the model will be caught at a moment when it's trying to record data on your disk and it will crash. This is a nuisance. As there are two ways to exit from Boinc, we need to know whether you have it installed as a 'service' or not. Like many people you may not know. Could you please copy for us the first 20 or so lines of your messages in your Boinc manager. Press the Shift key, then click on the first and then the last line you want to show us. They'll all be highlighted. Click 'Copy selected messages'. Then in a post here go to Page - Paste (or File - Paste) and they should appear in your post. We should then be able to tell you exactly how to exit from Boinc. (A Boinc exit is quick and easy; we just need to tell you the method for your Boinc installation.) Cpdn news |
Joined: 23 Aug 09 Posts: 5 Credit: 730,885 RAC: 279 |
Here they are:
10/13/2009 8:36:55 AM||Starting BOINC client version 6.2.28 for windows_intelx86
10/13/2009 8:36:55 AM||log flags: task, file_xfer, sched_ops
10/13/2009 8:36:55 AM||Libraries: libcurl/7.19.0 OpenSSL/0.9.8i zlib/1.2.3
10/13/2009 8:36:55 AM||Data directory: C:\Documents and Settings\All Users\Application Data\BOINC
10/13/2009 8:36:55 AM||Running under account Owner
10/13/2009 8:36:58 AM||Processor: 2 GenuineIntel Intel(R) Pentium(R) 4 CPU 3.40GHz [x86 Family 15 Model 3 Stepping 4]
10/13/2009 8:36:58 AM||Processor features: fpu tsc sse sse2 mmx
10/13/2009 8:36:58 AM||OS: Microsoft Windows XP: Home x86 Editon, Service Pack 3, (05.01.2600.00)
10/13/2009 8:36:58 AM||Memory: 1021.78 MB physical, 2.40 GB virtual
10/13/2009 8:36:58 AM||Disk: 232.88 GB total, 90.41 GB free
10/13/2009 8:36:58 AM||Local time is UTC -4 hours
10/13/2009 8:36:58 AM|climateprediction.net|URL: http://climateprediction.net/; Computer ID: 1001743; location: (none); project prefs: default
10/13/2009 8:36:58 AM|lhcathome|URL: http://lhcathome.cern.ch/lhcathome/; Computer ID: 9811710; location: (none); project prefs: default
10/13/2009 8:36:58 AM|Milkyway@home|URL: http://milkyway.cs.rpi.edu/milkyway/; Computer ID: 104520; location: (none); project prefs: default
10/13/2009 8:36:58 AM|Cosmology@Home|URL: http://www.cosmologyathome.org/; Computer ID: 60243; location: (none); project prefs: default
10/13/2009 8:36:58 AM|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 1030424; location: (none); project prefs: default
10/13/2009 8:36:58 AM||General prefs: from World Community Grid (last modified 31-Dec-1969 19:00:01)
10/13/2009 8:36:58 AM||Host location: none
10/13/2009 8:36:58 AM||General prefs: using your defaults
10/13/2009 8:36:58 AM||Reading preferences override file
10/13/2009 8:36:58 AM||Preferences limit memory usage when active to 510.89MB
10/13/2009 8:36:58 AM||Preferences limit memory usage when idle to 766.33MB
10/13/2009 8:36:58 AM||Preferences limit disk usage to 18.63GB
10/13/2009 8:38:56 AM|Cosmology@Home|Restarting task wu_100309_230540_1_1_0 using camb version 216 |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Matt. Thanks for the messages. Your Boinc isn't running as a service, otherwise it would say 'Running as a daemon'. Exiting from Boinc is easy. Right-click on the Boinc icon at the bottom of the screen (it's like a fried egg on a grill) and click Exit. The icon will disappear. Now it's safe, as far as the model is concerned, to shut down the computer. When you restart the computer the Boinc icon should reappear. If it ever doesn't, just go to Start > Programs > Boinc and click on Boinc manager. The icon should jump into place. Get used to doing that. Later, if you want to try making a backup (e.g. when you have a bit of spare time, say at the weekend), let us know and I'll explain an easy method click-by-click. Cpdn news |
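The check described above (look for 'Running as a daemon' in the startup messages) can be done mechanically. The Python sketch below assumes each message is on its own line in the `timestamp|project|message` layout seen in the pasted log; the names `LogLine`, `parse_line` and `summarize_startup` are hypothetical, and treating the exact text 'Running as a daemon' as the service marker is taken from this thread rather than from BOINC documentation.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class LogLine(NamedTuple):
    timestamp: str
    project: str  # empty for messages from the client itself
    message: str


def parse_line(line: str) -> Optional[LogLine]:
    """Split 'timestamp|project|message'; return None if the layout differs."""
    parts = line.strip().split("|", 2)
    if len(parts) != 3:
        return None
    return LogLine(*parts)


def summarize_startup(lines: Iterable[str]) -> Tuple[Optional[str], bool]:
    """Return (data directory, running-as-service flag) from startup messages."""
    data_dir = None
    is_service = False
    prefix = "Data directory: "
    for raw in lines:
        entry = parse_line(raw)
        if entry is None:
            continue
        if entry.message.startswith(prefix):
            data_dir = entry.message[len(prefix):]
        elif "Running as a daemon" in entry.message:
            is_service = True
    return data_dir, is_service
```

Applied to the messages posted above, this would report the `Documents and Settings` path as the data directory and no service, which matches the conclusion drawn by hand.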
Joined: 23 Aug 09 Posts: 5 Credit: 730,885 RAC: 279 |
OK, that sounds pretty easy to me. Just have to remember to do it - I'm not used to it, and I turn off my computer at bedtime when I'm often a bit drunk. I'm guessing you don't have to shut off BOINC if you go into standby mode without actually turning the computer off - I usually do that when I'm eating or away from my computer for whatever reason for more than half an hour or so. |
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
"... I'm guessing you don't have to shut off BOINC if you go into standby mode without actually turning the computer off ..." It would be a good idea to stop BOINC in that situation, though you would have to remember to start it again when the computer comes out of hibernation. Or you could abandon hibernation and just close down as usual at the end of the day - the model will finish earlier! Though BOINC is designed to cope with lots of these kinds of situations - starting, stopping, busy computers etc. - it is nonetheless exactly at these times that model errors tend to happen. If you want to complete more models then the models will have to be protected a bit (or backed up). I don't like making backups so I make one at the beginning of a run and take care not to crash the model before it's finished; other people take lots of backups and thrash their computers, restoring the backup if the model crashes. It just depends what each person wants to do. |
Joined: 15 Oct 09 Posts: 1 Credit: 381,051 RAC: 0 |
I am getting this error around 35% to 40% into the progressed time myself. A Fortran error. It looks like I might have to learn to make backups also, if that will help with this error. And no, I do not turn off my computer without exiting Boinc first. |
Joined: 7 Aug 04 Posts: 2169 Credit: 64,555,907 RAC: 5,858 |
Moved this thread from the "climateprediction.net Science" forum to the Windows one as the people reporting problems are running in that OS. The problems are not related to climateprediction.net science. |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Quinsha. The visual Fortran runtime error in a popup window is a nuisance because we don't know the cause. It says Fortran because that's the programming language the models are written in. If the error message appears repeatedly you could
* make backups in case the error crashes the model, though often the model continues successfully
* exclude Boinc from anti-virus scans
* completely exit from Boinc then restart it
* exit from Boinc and reboot the computer
* before all the climate models on the problem computer have finished, set CPDN to No new tasks, then when all the models have finished and reported, reset CPDN.
But sometimes the Fortran runtime error just stops appearing for no apparent reason. This happened to me. I don't know why the error message appeared a few times or why it then disappeared. Cpdn news |
©2024 climateprediction.net