Lasse Hintze Tøndering-Jensen
Joined: 18 Jun 14
Having run BOINC CP tasks for a while on about 10 computers running Windows 7, Windows 10 and Server 2012, I have recently noticed that a huge part of the tasks, 50-100%, end in “Error while computing”. What is up with that?
For example, I have a 48 processor Intel Xeon CPU E5-2697 which has run 31 tasks in about a week (4 are in progress), and 26 of them ended in “Error while computing”. Only one has completed.
Another machine, a 24 processor Intel Xeon CPU E5-2620, has run 12 tasks in about a week, all ending with “Error while computing”.
Strangely enough I do get some credit for what I take to be incomplete tasks. So, two questions. First, why does this project have this insane amount of computing errors compared to other projects? No other project I participate in is even remotely in this neighborhood of errors. Second, seeing that you do get credit for errors, do you get enough for the amount of processing power you put into it, or does a huge part end up in the bin?
I’m seriously considering dropping CP altogether, because I don’t want to waste CPU power on tasks that constantly fail to complete.
Joined: 15 May 09
Here are a number of questions which might give hints. I get very few errors, but my boxes are only 2/4 core machines.
1. Are the BOINC working and data directories excluded from antivirus scans? If an antivirus program has a lock on a file when BOINC wants to write to it, it will cause a crash.
2. Are the machines overclocked? This makes crashes more likely.
3. How fast are the disks compared to the throughput of work? I recall a previous thread where it was suggested that disk writes not being fast enough for the amount of work being done by so many cores was a problem.
4. Do you regularly have non-BOINC things happening on the computer which quickly take CPU time away from BOINC? CPDN does seem more prone to problems due to this.
5. Do you have "Leave non-GPU tasks in memory while suspended" ticked? If not, ticking it seems to greatly reduce the number of errors.
I am sure there are a couple more that I have missed out but brain isn't really in gear yet.
Joined: 5 Sep 04
The only other general one that I can think of is "Suspend work if CPU usage is above".
Unless this is either turned off or set to maximum (not sure what it's called these days), the task will keep being turned on and off every time that your own use of the computer causes it to exceed the limit.
You can get away with this for a while, but sooner or later you'll catch the program at a critical point, and it'll fail.
Some ideas for improving the chances of a task surviving are here.
I think that in the BOINC versions since then, the "Suspend work if CPU usage is above" setting needs to be 100%, not zero, to turn it off.
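For anyone who would rather set the two preferences mentioned in this thread from a file than through the BOINC Manager, here is a minimal Python sketch that builds a local preferences override. It rests on assumptions that are not from the posts above: that your client reads a global_prefs_override.xml file from its data directory, and that the tags are named leave_apps_in_memory and suspend_cpu_usage. The function name build_override is made up for illustration. The default of 100 follows the suggestion above that recent versions need 100%, not zero; check which value disables the limit on your client version before relying on it.

```python
import xml.etree.ElementTree as ET


def build_override(leave_in_memory=True, suspend_cpu_usage=100.0):
    """Return the text of a preferences override as an XML string.

    Assumption: the tag names below match what your BOINC client
    expects in global_prefs_override.xml. Verify before using.
    """
    root = ET.Element("global_preferences")
    # "Leave non-GPU tasks in memory while suspended": 1 = ticked
    ET.SubElement(root, "leave_apps_in_memory").text = "1" if leave_in_memory else "0"
    # "Suspend work if CPU usage is above": a percentage
    # (100 per the post above; some versions may use 0 for "off")
    ET.SubElement(root, "suspend_cpu_usage").text = f"{suspend_cpu_usage:.6f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML so it can be reviewed before being saved by hand.
    print(build_override())
```

The script only prints the XML and does not write anything into the BOINC data directory, so you can compare it with what your client already has. After saving a file like this, the client has to be told to re-read its local preferences, or be restarted.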