climateprediction.net home page
Is my computer working uselessly?

old_user519896
Joined: 28 May 08
Posts: 16
Credit: 32,985
RAC: 0
Message 33957 - Posted: 31 May 2008, 10:40:30 UTC

Hello.

I'm new to BOINC. I registered on 2008-05-28 and, three days later, nothing has changed on my account: my quota is stuck at 1 per day and I see messages like "Output file hadsm3fub_jotf_005950294_7_2.zip for task hadsm3fub_jotf_005950294_7 absent". Is everything OK, or is there a problem, and if so, how can I solve it?

--
Frederic

Pooh Bear 27
Joined: 5 Feb 05
Posts: 465
Credit: 1,914,189
RAC: 0
Message 33958 - Posted: 31 May 2008, 10:55:31 UTC

Frederic

Every result your machine has tried to crunch has crashed. All of the crashes are -107 crashes, which are usually hardware-related.

I noticed you have a mobile CPU; is this a laptop? I also noticed that it shares main memory with the video hardware; are you using the screen saver? Your machine is short on memory for this project.

If you want to continue with this project, here are a few suggestions:
Do not use the BOINC / CPDN screen saver; Blank is the recommended choice. Any screen saver on that machine will take up valuable resources and could cause crashes.

If it is a laptop, get it off the desk. Use something to raise it a little so that it can "breathe" more, or better yet buy a laptop cooler. Make sure all the air vents are free of dust.

If you continue to crash models, then you may want to start doing some intensive testing of your machine. memtest86+ and Prime95 are a couple of good testers.

I just noticed your speeds are low for that machine, so it is having issues. Make sure you do some of the cleaning and testing above.

Good luck, and know that not all machines can handle CPDN.

old_user519896
Message 33959 - Posted: 31 May 2008, 12:23:42 UTC - in response to Message 33958.  

Thanks for your answer.

> Every result your machine has tried to crunch has crashed. All of the crashes are -107 crashes, which are usually hardware-related.

Sorry, I don't see the crashes as such in the Messages pane. All I see is "Started download ... / Finished download ... / Starting ... / Starting task ... / Computation for ... finished" and then the "Output ... absent" messages. Nothing I recognize as an error message (except for "absent"). Is this expected?

> I noticed you have a mobile CPU; is this a laptop? I also noticed that it shares main memory with the video hardware; are you using the screen saver? Your machine is short on memory for this project.
>
> If you want to continue with this project, here are a few suggestions:
> Do not use the BOINC / CPDN screen saver; Blank is the recommended choice. Any screen saver on that machine will take up valuable resources and could cause crashes.

Yes, it is a laptop. Sorry, that's all I have to offer :-( I don't use the BOINC screen saver, though. I was using the standard XP screen saver, but I just switched to the blank screen saver to see if it improves memory usage.

> If it is a laptop, get it off the desk. Use something to raise it a little so that it can "breathe" more, or better yet buy a laptop cooler. Make sure all the air vents are free of dust.
>
> If you continue to crash models, then you may want to start doing some intensive testing of your machine. memtest86+ and Prime95 are a couple of good testers.

I don't experience any crashes. I own two laptops, but I currently use this one for BOINC because the other one has stability issues (probably OS-related). I only use this one to access a remote server (TS client), so it has more CPU time available. Since I use this laptop to access my work, I can assure you that any crash would be noticed immediately!

> I just noticed your speeds are low for that machine, so it is having issues. Make sure you do some of the cleaning and testing above.

I am going to send the other laptop to be fixed. When it comes back, I will reinstall the OS and switch to it.

> Good luck, and know that not all machines can handle CPDN.

Thanks.


Frederic

Pooh Bear 27
Message 33960 - Posted: 31 May 2008, 13:43:20 UTC

Frederic,

Go to your account and view the tasks. They are all marked "Compute error", which means the result crashed while it was being worked on. On your computer these results should take a few months to finish, not the few minutes they are taking. You never even get far enough to send back a trickle (a bit of data that shows the progress your result has made at a certain point in time). Your computer isn't crashing, but the result itself is.


old_user519896
Message 33961 - Posted: 31 May 2008, 14:20:45 UTC - in response to Message 33960.  

Thanks, it's much, much clearer now. I hope the aborted tasks are not lost. I am a bit disappointed by the lack of hints about what is going wrong. I suppose there is nothing you can do about it, but it is frustrating when software aborts without any information about why it did so. I checked the whole directory tree without finding anything. I found a stderr_um.txt, but it is empty :(

If tomorrow's job does not work properly, I'm going to unsubscribe from climateprediction; consuming watts uselessly is definitely not a way to help the climate!
Frederic

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 33964 - Posted: 31 May 2008, 15:32:53 UTC
Last modified: 31 May 2008, 15:39:04 UTC

Hello Frédéric, welcome to the forum and to the project.

Here is the detailed task page for one of your models:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7468399

If you click the + beside stderr out, you'll see some details about what caused the crash. (These details only show once a model has finished computation, whether it crashed or completed.)

CPDN has 5 collections of README posts containing lots of useful advice. You can get to them by clicking on the link in my signature at the bottom of this post. I recommend that you look at the collection about Crashes and Problems. In that collection, go to link #6, where MikeMars explains all the frequent causes of model crashes, including -107 code crashes.

Link #7 in the same problems collection is a post by Thyme Lawn, who explains how to update graphics card drivers. This is a free download from the web, like Windows updates. Updating these drivers often cures -107 errors, and it's good for the computer.

Hope that helps.

Mo
Cpdn news

old_user519896
Message 33965 - Posted: 31 May 2008, 19:18:31 UTC

Ah, at least I feel we are going somewhere! I like that!

> There are a number of common errors which cause many people problems. The first is the Windows Stop message (appears as a Microsoft Send / Don't Send dialogue, and -1073741819 in the log)

I believe I have something like it. I found this in stderr out:
- exit code -1073741819 (0xc0000005)

BUT I don't get any dialog. I am using XP SP3; maybe that is related.
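
For reference, the two numbers on the exit code line are the same 32-bit value: the negative decimal is the signed reading and 0xc0000005 is the unsigned one. A minimal Python sketch of the conversion follows. The function names are hypothetical, and the lookup table is illustrative only: it holds just the one code discussed in this thread, which Windows documents as STATUS_ACCESS_VIOLATION.

```python
# Convert a signed 32-bit Windows exit code to its unsigned NTSTATUS form.

# Illustrative table: only the code seen in this thread is listed.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def to_ntstatus(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def describe(exit_code: int) -> str:
    """Return the hex form, plus a name when the code is in KNOWN_STATUS."""
    status = to_ntstatus(exit_code)
    name = KNOWN_STATUS.get(status, "unknown")
    return f"0x{status:08x} ({name})"


if __name__ == "__main__":
    print(describe(-1073741819))
```

This also explains the "-107" shorthand used earlier in the thread: it is simply the first digits of the signed decimal form.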

Anyhow, this made me think that my issue could be firewall-related, or rather Kerio-related. I found some information about Kerio here and set my rules accordingly. I will check tomorrow whether this solves the issue and report back here.


Frederic

mo.v
Message 33968 - Posted: 1 Jun 2008, 1:32:23 UTC

I think a graphics problem is more probable than your firewall. But it's also a very good idea to exclude the BOINC folder from your anti-virus scans if you do these scans while BOINC is running.
Cpdn news

old_user519896
Message 33970 - Posted: 1 Jun 2008, 9:30:45 UTC
Last modified: 1 Jun 2008, 9:33:15 UTC

Here is where I am, now:

- I ran memtest86+ and SuperPI (32M) without finding anything abnormal (I did not expect to find anything, since I have not experienced any crashes on this machine). If my new task crashes too, I'll run Prime95, just to be sure.

- Since the hardware did not seem to be the culprit, I attacked the firewall. Here is what I did:

  • Delete all BOINC-related rules in the firewall settings.
  • In the BOINC client, request a project update to trigger the Kerio/Sunbelt popup, which asks whether this application is allowed to access the internet.
  • Confirm that you accept all communications.
  • Modify the rule for BOINC.exe: accept all communications for BOINC (all zones and all directions) and enable network logging (I'll remove logging once I'm sure it is not a communications issue).



BTW, I made a mistake when I posted the link earlier, this is the correct link

My task has been running for more than 10 hours now, which is a little more than my previous longest run of 6 hours, but of course it is much too early to say whether the firewall changes solved the problem.

mo.v, you made me think of something: I did not install the screen saver, but I did manually open the graphical display and play with it. I never experienced any obvious problem with it, but I don't know how it works. Could it crash the engine but continue to show the rotating earth as if nothing had gone wrong?


Frederic

mo.v
Message 33974 - Posted: 1 Jun 2008, 12:26:33 UTC
Last modified: 1 Jun 2008, 12:28:19 UTC

The graphical display only opens and shows your globe if the model is running. Using it and playing round with its display modes shouldn't cause problems, but members who've had possibly graphics-related error codes like 107 and -107 should avoid maximising the graphics window. Keep the globe window small.

Have you updated your graphics card drivers?

Have a look at the README about backing up the BOINC folder. Les's manual method is easy and works well. With regular backups, if you do have another model crash, you can restore the backup and continue the same model. I back up my BOINC folders regularly even though my computers are running very stably at the moment. Because the models are so long, the probability that they will crash before they complete is quite high even on an exceptionally good computer.
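
As a rough illustration of what backup and restore amount to, here is a minimal Python sketch. It assumes BOINC has already been exited before either function is called. The function names and the timestamped folder naming are hypothetical and are not taken from the README; the README's own manual method remains the reference.

```python
import shutil
import time
from pathlib import Path


def backup_folder(boinc_dir: Path, backup_root: Path) -> Path:
    """Copy boinc_dir into a new timestamped folder under backup_root."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"boinc-backup-{stamp}"
    # copytree raises an error if the target folder already exists,
    # so an older backup is never silently overwritten.
    shutil.copytree(boinc_dir, target)
    return target


def restore_folder(backup_dir: Path, boinc_dir: Path) -> None:
    """Replace boinc_dir with a copy of backup_dir."""
    if boinc_dir.exists():
        shutil.rmtree(boinc_dir)  # discard the crashed state
    shutil.copytree(backup_dir, boinc_dir)
```

Keeping each backup in its own folder means an earlier copy is still available if the most recent one turns out to contain the problem.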
Cpdn news

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 33979 - Posted: 1 Jun 2008, 21:33:02 UTC
Last modified: 1 Jun 2008, 21:34:32 UTC

Frederic
You have a laptop. These use a few chips on the mainboard for the display, as well as some of the mainboard's memory.
When anything that needs a lot of display memory runs, such as a picture, larger amounts of memory are suddenly needed, and this can "pull the carpet out from under the climate program".

The program (and its models) was originally developed for big supercomputers, so people running it on desktops need a reasonably well-resourced computer to prevent problems.
Laptops are worse, because of their more limited memory, their lack of a separate graphics card, their smaller, "lighter duty" HDs, and their reduced cooling capabilities.

And the "107..." errors are Windows "stop" errors. (Look them up on the net, or in the Microsoft Knowledge Base.)
There seem to be 2 main causes:
1) Shutting down a computer without first shutting down BOINC,
2) (More often) something to do with the display, usually something using a lot of memory, or an outdated (or generic MS) graphics driver that isn't handling the requirements of the model's display very well.

These errors are not firewall related.
Backups: Here

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 33981 - Posted: 2 Jun 2008, 2:35:46 UTC - in response to Message 33965.  


> I am using XP SP3; maybe that is related.

I am running fine on an XP SP3 machine.

old_user519896
Message 33986 - Posted: 2 Jun 2008, 16:06:44 UTC

Thanks for all your answers, everyone. Frankly, when I calculated the number of days which I would need to finish my task, I started thinking about quitting, but the help I am getting here makes me want to stay on board.

I like the graphics explanation better too, but I want to be sure. Now that I know there is a way to back up and recover from a crash, I am going to check whether the interface is the culprit as soon as I have a little time.
Frederic

mo.v
Message 33987 - Posted: 2 Jun 2008, 16:30:16 UTC
Last modified: 2 Jun 2008, 16:36:01 UTC

On projects where the workunits are short there isn't much time for a disaster to occur, and most computers complete most tasks. For example, if you play graphics-intensive games for 2 hours per day and your graphics card can't manage the games + the task simultaneously, you would probably still complete most short tasks successfully while you're not playing games and never realise that there's a problem. But with a climate model that lasts for months, every day that user could create a possible model crash situation.

Fortunately in the case of the HADCM 160 and 80-year models, they upload a decadal zip file to the server after every 10 years. The information contained in these files is used by the researchers even if the model later crashes.

In the case of the shorter HADAM and HADSM models I think they need to complete in order to be useful to the researchers.

CPDN is almost without doubt the most difficult project to learn to crunch successfully because of the length of the tasks. But we do get credits for every successful trickle even if the model later crashes.

I think most new CPDN members probably crash several models before they learn what to do and what not to do. I crashed a few.

Members who would prefer a shorter model next time they download one can select the model type in their CPDN preferences in their account. This may not be possible today because there are some server database problems at the moment.
Cpdn news

old_user519896
Message 33988 - Posted: 2 Jun 2008, 18:44:04 UTC

If I continue at my current rate, my next download should be in about 270 days (plus the unplanned but unavoidable delays). I guess I have time enough to choose an easier job for next time :-D
Frederic

mo.v
Message 33990 - Posted: 2 Jun 2008, 21:20:10 UTC

The choice of models may then be different.
Cpdn news

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,107,227
RAC: 2,727
Message 33991 - Posted: 3 Jun 2008, 0:57:46 UTC - in response to Message 33986.  

> Thanks for all your answers, everyone. Frankly, when I calculated the number of days which I would need to finish my task, I started thinking about quitting, but the help I am getting here makes me want to stay on board.
>
> I like the graphics explanation better too, but I want to be sure. Now that I know there is a way to back up and recover from a crash, I am going to check whether the interface is the culprit as soon as I have a little time.


Hi, I am glad to see that you are continuing with the project. Backups are vital. I am running the 3 coupled models on 2 laptops and I can tell you that I never get through one without restoring it half a dozen times! Making backups is easy, and once you have done it a few times you can do it in your sleep. I make one every morning (that way I only lose a few hours of crunching time). It only takes about 5 minutes. If you only make a backup once a week, the model will proudly crash on day 6 and you will have to repeat the whole week. Happy crunching.
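
The trade-off between backup frequency and lost work can be put into numbers. The sketch below assumes that a crash is equally likely at any moment between two backups, which is a simplification made here for illustration and not a measured property of CPDN models. Under that assumption the worst case loses the whole interval and the average case loses half of it. The function name is hypothetical.

```python
from typing import Tuple


def lost_hours(backup_interval_hours: float) -> Tuple[float, float]:
    """Return (average, worst-case) hours of work lost per crash.

    Assumes the crash time is uniformly distributed over the interval
    between two backups, which is an illustrative simplification.
    """
    worst = backup_interval_hours
    average = backup_interval_hours / 2
    return average, worst


if __name__ == "__main__":
    for label, hours in [("daily", 24), ("weekly", 7 * 24)]:
        average, worst = lost_hours(hours)
        print(f"{label}: average {average:.0f} h, worst case {worst:.0f} h")
```

The hours here are elapsed time between backups; a machine that only crunches part of the day loses correspondingly less actual crunching time.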



Jim

old_user519896
Message 33993 - Posted: 3 Jun 2008, 11:23:25 UTC

Thanks for the idea, JIM. To make things easier, I just created a small batch file which suspends, backs up and restarts the project. Now that I know how to do it, I will create another one to suspend and back up BOINC before shutting down the computer. I believe I have seen tools that allow you to launch tasks before Windows closes.
Frederic

mo.v
Message 33997 - Posted: 3 Jun 2008, 18:30:45 UTC

Hi again

Make sure your batch file also exits from BOINC, which is what you do manually by File > Exit in BOINC manager, or by right-clicking on the system tray fried-egg-on-grill icon then selecting Exit.

If you make a backup without exiting from BOINC first, there's no real guarantee that it will restore successfully and that the tasks will run.
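
The ordering described above (exit first, archive second) can be sketched in Python as follows. This is a sketch under stated assumptions, not a tested procedure: it assumes the boinccmd tool is on the PATH and that its --quit option asks the running client to exit, and it assumes a fixed pause is long enough for the client to release its files. The function name, the pause length and the injectable `run` parameter are all hypothetical.

```python
import shutil
import subprocess
import time
from pathlib import Path


def backup_after_exit(
    boinc_dir: Path,
    archive_base: Path,
    run=subprocess.run,
    wait_seconds: float = 10.0,
) -> str:
    """Ask the BOINC client to quit, wait, then zip boinc_dir.

    `run` is injectable so the sequence can be tried without BOINC
    installed. Returns the path of the zip archive that was written.
    """
    # Ask the running client to exit (assumes boinccmd is on PATH).
    run(["boinccmd", "--quit"], check=True)
    # Assumption: a fixed pause is enough for the client to release files.
    time.sleep(wait_seconds)
    # Archive only after the client has been told to exit.
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(boinc_dir))
```

The archive base should point outside the BOINC folder so that the zip file is not created inside the folder being archived. Restarting BOINC afterwards is left out of the sketch.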
Cpdn news

old_user519896
Message 34000 - Posted: 3 Jun 2008, 19:47:14 UTC

Funny, I just found this out! I was using 7zip (the command-line version, of course) to compress, and I saw that 7zip complained that some files were still locked (although the project is shown as suspended in the BOINC GUI). I believe there are some zip switches which could work around the lock issue, but I'd rather compress once all locks are released.

So is the procedure Suspend-then-Exit completely ok? Is it any use to do
  boinccmd --project http://climateprediction.net suspend
  boinccmd --quit
  (zip)
  boincmgr

or is --quit enough?

... or is there something else required?
Frederic

©2024 climateprediction.net