Can't compute WU without error


Thorvin

Joined: 30 Aug 04
Posts: 2
Credit: 403,463
RAC: 0
Message 14456 - Posted: 16 Jul 2005, 20:17:29 UTC

Hello,

I'm very new to Climateprediction.net, and so far I have never managed to complete a WU.

On two computers (Win 2000 and Win XP) there is always a computing error after about 1.0 % is finished, and the BOINC Manager gets a new Workunit.
Has anybody had this problem before?
Any help is appreciated. Thanks in advance.

greetings
Markus
old_user2491

Joined: 28 Aug 04
Posts: 117
Credit: 21,096
RAC: 0
Message 14457 - Posted: 16 Jul 2005, 21:06:39 UTC

Unrecoverable error for result '(result)' ( - exit code -5 (0xfffffffb))
From BOINCWiki

General
Message Type: Error Message

This message means that the BOINC Client Software has detected a failure of the Science Application which has occurred because of an internal error (it "crashed"). These errors may be the result of a hardware error, but are more likely because of a problem with the Science Application software and the Work Unit that it is trying to process. These errors should be reported to the Project so that they can try to find the problem in the software.

Climateprediction.net
While we know that these crashes in Climateprediction.net's Science Application occur in both the Classic version and the versions powered by BOINC, they tend to be machine specific. Occasional crashes may indeed relate to the particular Work Unit, but experienced Participants know they are comparatively rare.

Hardware is one of the more common causes of failure in Climateprediction.net because it stresses machines far more than most things, and doubters have often come back later after having tracked down a fault. The rest of these causes are elusive, but the occasional success in finding the cause, as with Anti-Virus software or driver conflicts, shows that it is well worth looking for these other causes.

Version Information
None.
Example Log
Crash!
06-06-2005 22:02:53|climateprediction.net|Restarting result 38zl_200173143_1 using hadsm3 version 4.12
-+-+/\/\/\/\/\ Log Break /\/\/\/\/\-+-+-
06-06-2005 22:34:08|climateprediction.net|Unrecoverable error for result 38zl_200173143_1 ( - exit code -5 (0xfffffffb))
-+-+/\/\/\/\/\ Log Break /\/\/\/\/\-+-+-
06-06-2005 22:34:08|climateprediction.net|Computation for result 38zl_200173143_1 finished
Line-By-Line Explanation
Restarting result '(result)' using '(science-application)' version '(version)'
And now it has restarted processing. There is little difference between "Resuming" and "Restarting": "Resuming" occurs if the Result's information is still resident in memory, while "Restarting" indicates a reload of the in-progress information.
Unrecoverable error for result '(result)' ( - exit code -5 (0xfffffffb))
The Science Application has "crashed" and stopped processing the Result/Work Unit.
Computation for result '(result)' finished
Because of the "crash" the current Result has been halted. When the Reporting Process is complete, this Result will be recorded with an Outcome of "Client Error".
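The two numbers in the "exit code" message are the same value written twice: once as a signed decimal and once as its unsigned 32-bit hexadecimal form. A minimal Python sketch of that conversion follows; the function names are made up for illustration and are not part of BOINC.

```python
def exit_code_to_hex(code: int) -> str:
    """Format a signed 32-bit exit code as unsigned 32-bit hex, like the log does."""
    return f"0x{code & 0xFFFFFFFF:08x}"


def hex_to_exit_code(text: str) -> int:
    """Convert the hex form back to the signed decimal exit code."""
    value = int(text, 16)
    # Values with the top bit set represent negative numbers in two's complement.
    return value - 0x100000000 if value >= 0x80000000 else value


print(exit_code_to_hex(-5))            # 0xfffffffb
print(hex_to_exit_code("0xfffffffb"))  # -5
```

This is why -5 appears next to 0xfffffffb in the log lines above.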
Permanently Failed Upload of File
One mildly interesting item in this message display is that you can see two upload threads running; because they fail at different rates, the messages become interleaved. In this specific example, files #1 and #2 are started, #1 fails, and #3 is started before #2 fails. Hey, sometimes I get bored with this stuff and wonder if anyone reads any of it at all.

2005-06-26 23:44:19 [climateprediction.net] Restarting result 034t_000008942_0 using hadsm3 version 4.12
-+-+/\/\/\/\/\ Log Break /\/\/\/\/\-+-+-
2005-06-26 23:45:06 [climateprediction.net] Unrecoverable error for result 034t_000008942_0 ( - exit code -5 (0xfffffffb))
2005-06-26 23:45:06 [---] request_reschedule_cpus: process exited
2005-06-26 23:45:06 [climateprediction.net] Computation for result 034t_000008942_0 finished
-+-+/\/\/\/\/\ Log Break /\/\/\/\/\-+-+-
2005-06-26 23:45:07 [climateprediction.net] Started upload of 034t_000008942_0_1.zip
2005-06-26 23:45:07 [climateprediction.net] Started upload of 034t_000008942_0_2.zip
-+-+/\/\/\/\/\ Log Break /\/\/\/\/\-+-+-
2005-06-26 23:45:12 [climateprediction.net] Error on file upload: invalid signature
2005-06-26 23:45:12 [climateprediction.net] Permanently failed upload of 034t_000008942_0_1.zip
2005-06-26 23:45:12 [climateprediction.net] Giving up on upload of 034t_000008942_0_1.zip: server rejected file
2005-06-26 23:45:12 [climateprediction.net] Started upload of 034t_000008942_0_3.zip
-+-+/\/\/\/\/\ Log Break /\/\/\/\/\-+-+-
2005-06-26 23:45:20 [climateprediction.net] Error on file upload: invalid signature
2005-06-26 23:45:20 [climateprediction.net] Permanently failed upload of 034t_000008942_0_2.zip
2005-06-26 23:45:20 [climateprediction.net] Giving up on upload of 034t_000008942_0_2.zip: server rejected file
2005-06-26 23:45:20 [climateprediction.net] Started upload of 034t_000008942_0_4.zip
2005-06-26 23:45:22 [climateprediction.net] Error on file upload: invalid signature
2005-06-26 23:45:22 [climateprediction.net] Permanently failed upload of 034t_000008942_0_3.zip
2005-06-26 23:45:22 [climateprediction.net] Giving up on upload of 034t_000008942_0_3.zip: server rejected file
2005-06-26 23:45:22 [climateprediction.net] Started upload of 034t_000008942_0_5.zip
-+-+/\/\/\/\/\ Log Break /\/\/\/\/\-+-+-
2005-06-26 23:45:32 [climateprediction.net] Error on file upload: invalid signature
-+-+/\/\/\/\/\ Log Break /\/\/\/\/\-+-+-
2005-06-26 23:45:33 [climateprediction.net] Permanently failed upload of 034t_000008942_0_5.zip
2005-06-26 23:45:33 [climateprediction.net] Giving up on upload of 034t_000008942_0_5.zip: server rejected file
2005-06-26 23:45:33 [climateprediction.net] Error on file upload: invalid signature
2005-06-26 23:45:33 [climateprediction.net] Permanently failed upload of 034t_000008942_0_4.zip
2005-06-26 23:45:33 [climateprediction.net] Giving up on upload of 034t_000008942_0_4.zip: server rejected file
Line-By-Line Explanation
Restarting result '(result)' using '(science-application)' version '(version)'
We restart a Result that has already been partially processed.
Unrecoverable error for result '(result)' ( - exit code -5 (0xfffffffb))
The Science Application has "crashed" and stopped processing the Result/Work Unit.
request_reschedule_cpus: process exited
Because the Science Application ended processing (it "crashed"), we need to start up another Result.
Computation for result '(result)' finished
Because of the "crash" the current Result has been halted. When the Reporting Process is complete, this Result will be recorded with an Outcome of "Client Error".
Started upload of '(file)'
We start to upload two of the 5 Result Data Files.
We will continue to attempt to upload the other Result Data Files in turn; each will fail to upload with the following sequence of messages ...
Error on file upload: invalid signature
The Signature of the Result Data File is incorrect.
Permanently failed upload of '(file)'
Because of the invalid signature, this Result Data File cannot be accepted by the Data Server.
Giving up on upload of '(file)': server rejected file
The Data Server rejected the file because of the invalid Signature, and we cannot assign a different signature, so we abandon the attempt to upload the file.
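The interleaving of the two upload threads can be checked mechanically. The sketch below is only an illustration: it assumes the message format seen in the log above, uses a hypothetical helper name, and counts how many uploads are in flight at once by tracking "Started" and "Permanently failed" messages.

```python
import re

# A few lines copied from the example log above.
LOG = """\
2005-06-26 23:45:07 [climateprediction.net] Started upload of 034t_000008942_0_1.zip
2005-06-26 23:45:07 [climateprediction.net] Started upload of 034t_000008942_0_2.zip
2005-06-26 23:45:12 [climateprediction.net] Permanently failed upload of 034t_000008942_0_1.zip
2005-06-26 23:45:12 [climateprediction.net] Started upload of 034t_000008942_0_3.zip
2005-06-26 23:45:20 [climateprediction.net] Permanently failed upload of 034t_000008942_0_2.zip
"""

# Assumed line layout: "<date> <time> [<project>] <message>".
EVENT = re.compile(
    r"^\S+ \S+ \[[^\]]+\] (Started|Permanently failed) upload of (\S+)$"
)


def max_concurrent_uploads(log_text: str) -> int:
    """Return the largest number of uploads that were active at the same time."""
    active = set()
    peak = 0
    for line in log_text.splitlines():
        match = EVENT.match(line)
        if match is None:
            continue  # not an upload start/fail message
        action, name = match.groups()
        if action == "Started":
            active.add(name)
        else:
            active.discard(name)
        peak = max(peak, len(active))
    return peak


print(max_concurrent_uploads(LOG))  # 2
```

A peak of two matches the observation that two uploads run side by side, with a new one started each time an earlier one fails.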

<a href="http://boinc-doc.net/boinc-wiki/index.php?title=Main_Page">Link to Unofficial Wiki for BOINC, by Paul and Friends</a>
Ananas
Volunteer moderator

Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 14460 - Posted: 16 Jul 2005, 21:36:50 UTC
Last modified: 16 Jul 2005, 21:40:50 UTC

AFAIK, "Access denied" (your error -5) can be caused by two things:

- You're running a virus scanner that doesn't work well with CPDN and/or BOINC (FreeAV?). If you can, exclude all of BOINC from being scanned.

- You installed CPDN as an admin user, and the permissions of your BOINC user do not allow it to update some files.


I have heard that Defrag can have a bad influence on BOINC too, so if you defragment from time to time, you could shut down BOINC before you do.

BOINC is very picky about files that are open in another program, even if only for reading. Maybe this gives you a hint about what to watch out for. Good luck!
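For the permissions case, a rough first check is to walk the BOINC directory as the user that runs BOINC and list anything that user cannot write to. This is only a sketch: the function name and the example path are assumptions, and on Windows `os.access` may not reflect every NTFS permission setting, so an empty result does not prove the permissions are fine.

```python
import os


def find_unwritable(root: str) -> list:
    """Return the paths under root that the current user cannot write to."""
    blocked = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.W_OK):
                blocked.append(path)
    return sorted(blocked)


if __name__ == "__main__":
    # Hypothetical install location; replace with your own BOINC directory.
    for path in find_unwritable(r"C:\Program Files\BOINC"):
        print("not writable:", path)
```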
Thorvin

Joined: 30 Aug 04
Posts: 2
Credit: 403,463
RAC: 0
Message 14486 - Posted: 18 Jul 2005, 9:00:47 UTC

Hello,

I'll try to avoid scanning the BOINC files with the virus scanner; thanks for the idea.

Here are some error messages:

2005-07-15 02:10:35 [climateprediction.net] Unrecoverable error for result 3ep4_200180615_1 ( - exit code -1073741819 (0xc0000005))
[...]
2005-07-15 19:56:24 [climateprediction.net] Unrecoverable error for result 0cgk_100041246_1 ( - exit code -1073741819 (0xc0000005))
[...]
2005-07-18 10:45:05 [climateprediction.net] Unrecoverable error for result 0ctg_100041713_1 (Es gibt keine untergeordneten Prozesse, auf die gewartet werden muss. (0x80) - exit code 128 (0x80))
[...]
2005-07-13 21:35:07 [Einstein@Home] Unrecoverable error for result w1_0312.0__0312.0_0.1_T03_S4hA_3 ( - exit code -164 (0xffffff5c))

greetings Thorvin
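The decimal and hexadecimal values in these messages can be pulled out and cross-checked with a short script. The sketch below assumes the "exit code N (0x...)" layout shown in the messages above; the helper name is hypothetical. One hedged observation: 0xc0000005 is the Windows NTSTATUS value for an access violation, which is a different code from the -5 discussed earlier in the thread.

```python
import re

# Matches the "exit code <decimal> (<hex>)" part of a BOINC message line.
EXIT = re.compile(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)")


def parse_exit_code(line: str):
    """Return (decimal, hex_value, consistent) or None if no exit code is present."""
    match = EXIT.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    shown = int(match.group(2), 16)
    # The hex form is the decimal value viewed as an unsigned 32-bit number.
    return code, shown, (code & 0xFFFFFFFF) == shown


line = ("2005-07-15 02:10:35 [climateprediction.net] Unrecoverable error for "
        "result 3ep4_200180615_1 ( - exit code -1073741819 (0xc0000005))")
print(parse_exit_code(line))  # (-1073741819, 3221225477, True)
```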
old_user1
Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 14487 - Posted: 18 Jul 2005, 9:13:04 UTC - in response to Message 14486.  

If you (or anyone else) find that the virus scanner is interfering, please let us know. The climate model does not seem to like "co-existing" with some virus scanners, i.e. ones that try to access data files when the model (Fortran code) wants to open them, etc. If you are able to change the virus scanner to only search through executable files, that may help.
Ananas
Volunteer moderator

Joined: 31 Oct 04
Posts: 336
Credit: 3,316,482
RAC: 0
Message 14489 - Posted: 18 Jul 2005, 9:17:25 UTC
Last modified: 18 Jul 2005, 9:23:21 UTC

The error message means something like "There are no unwaited-for child processes"; that is, the project client died while BOINC still expected it to be there. Unfortunately it doesn't say <i>why</i> the project client crashed or was killed.

If your computer is overclocked, you could try to reduce the speed a little. As it's surely quite warm in your area, you could watch the CPU temperature too. If your motherboard didn't come with a program that does that, you could use <a href="http://mbm.livewiredev.com/">Motherboard Monitor</a> for this check. Maybe there are too many dust bunnies in your box. I currently run some of my faster boxes with open side doors as it's quite warm in my room.

If you're using FreeAV, you could try <a href="http://www.bitdefender.de">BitDefender</a> instead. Or maybe run without a scanner for a while: if you're aware of the higher risk and do not use Internet Explorer or Outlook, this shouldn't cause trouble, and it might help locate the problem.
___________

One thing caused trouble on one of my Win2000 computers once: the power-saving setting of the motherboard (this ACPI stuff). It is not very likely that this is your problem too, but it isn't impossible of course.
___________

Edit: You could check whether you have any log and dmp files here:

C:/Dokumente und Einstellungen/All Users/Dokumente/DrWatson

If there is something, clean out the directory (delete the files or move them somewhere else) and then retry BOINC.

After the crash, it might have created drwtsn32.log again with some information about the crash.
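Checking that directory before and after a retry can be scripted. The following is a small sketch with a made-up function name; it simply lists the .log and .dmp files in whichever directory you give it, using the path mentioned above as the example.

```python
from pathlib import Path


def list_crash_files(directory: str) -> list:
    """Return the names of .log and .dmp files in the directory, sorted."""
    folder = Path(directory)
    if not folder.is_dir():
        return []  # directory missing: nothing to report
    return sorted(
        entry.name
        for entry in folder.iterdir()
        if entry.is_file() and entry.suffix.lower() in {".log", ".dmp"}
    )


if __name__ == "__main__":
    # Path as given in the post; it differs on non-German Windows installs.
    print(list_crash_files(
        "C:/Dokumente und Einstellungen/All Users/Dokumente/DrWatson"))
```

If drwtsn32.log shows up in the list again after the directory was cleaned, the crash was recorded there.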
