Posts by Dave Jackson

21) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70750)
Posted 20 days ago by Profile Dave Jackson
Post:
Starting to get some tasks from batch 1009 - I assume these are the test run.
I can confirm these are from the test batch of 100 tasks.

Edit: And I would guess they have all gone now so I won't get any unless there are failures.
22) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70748)
Posted 20 days ago by Profile Dave Jackson
Post:
I have deleted the last resend. It was a _2 so won't be sent again now. I have left the five started tasks from 1008 going and there is a resend from 1007 at 88%. I have also set the machine to no new tasks till I get some hints about the imminence of the 100 tasks being released.

Edit: I think if BOINC were to cater for this type of test it would almost certainly mess something else up!

Edit2: Given the time I would not be surprised if the test doesn't arrive till Monday though I have been caught out before by batches being released over the weekend.
23) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70746)
Posted 20 days ago by Profile Dave Jackson
Post:
I believe the Intel runs are behaving correctly and failing. It's the AMD runs not behaving.
The only reason I haven't asked why is I almost certainly will not understand the answer! ;)
24) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70744)
Posted 20 days ago by Profile Dave Jackson
Post:
Thanks Glen. I will abort the two not started yet as credit isn't an issue for me.
I was clearly a bit premature with that as I have picked up one more resend from 1008.
25) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70742)
Posted 20 days ago by Profile Dave Jackson
Post:
Thanks Glen. I will abort the two not started yet as credit isn't an issue for me.
26) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70738)
Posted 21 days ago by Profile Dave Jackson
Post:
I believe the Intel runs are behaving correctly and failing. It's the AMD runs not behaving.
Should I just abort the two that are yet to start? I have five others that I can save files from that have all produced either 4 or 5 zips. Or would looking at what happens at the point where they fail on Intel machines be more useful?
27) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70734)
Posted 21 days ago by Profile Dave Jackson
Post:
I would suggest that those with Intel processors set CPDN to no new tasks till this is sorted.

Edit: It is possible the batch might be closed which would stop resends and let those with work on AMD machines complete it.

Edit: I think it is being paused, which will stop resends. I have looked at over 20 hard fails, and every single one is at the same point on an Intel machine. I have seven from the batch on my machine: four have produced 5 zips and trickle-up messages, one has produced four, and two are waiting to start. It is most odd.
28) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70727)
Posted 22 days ago by Profile Dave Jackson
Post:
Yes, phenom2, all Ryzen and Threadripper CPUs support SSE4.2.
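
For anyone who wants to check their own machine rather than look the CPU up, here is a minimal sketch for a Linux host. It assumes the kernel lists the feature as `sse4_2` on the `flags` line of `/proc/cpuinfo`; the function names are just illustrative and this is not anything the project itself provides.

```python
def has_sse4_2(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists sse4_2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Compare whole tokens so that e.g. 'sse4a' is not mistaken for a match.
        if key.strip() == "flags" and "sse4_2" in value.split():
            return True
    return False


def host_has_sse4_2(path: str = "/proc/cpuinfo") -> bool:
    """Check the local Linux host; returns False if the file cannot be read."""
    try:
        with open(path, encoding="utf-8") as fh:
            return has_sse4_2(fh.read())
    except OSError:
        return False
```

Calling `host_has_sse4_2()` inside a Linux VM reports what the guest sees, which may differ from the host if the hypervisor masks CPU features.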
29) Message boards : Number crunching : New Work Announcements 2024 (Message 70717)
Posted 22 days ago by Profile Dave Jackson
Post:
Please! Keep this thread for announcements about new work. Any discussion should go in new thread(s).
30) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70714)
Posted 22 days ago by Profile Dave Jackson
Post:
Dave, you might recall your dev test did fail and that was on AMD.
And that one completed for Richard on an Intel machine
31) Message boards : Number crunching : Batch 1008, and test batches 1009 to 1014 for Windows - issues (Message 70710)
Posted 22 days ago by Profile Dave Jackson
Post:
The only pattern I've noticed (if it is a pattern), is that my failures were on a Win10 VM running on an Intel chip, whereas the same VM running on an AMD has got 3 tasks past 1/Jan.


Any idea of the percentage of Intel vs AMD chips? I have been trawling and every single failure I have looked at has been Intel but the overwhelming majority of tasks have not returned a zip yet, so there is no evidence they are running correctly. Mine which have returned zips are all Win10 in a VM as opposed to WINE, which might mask failures. (All on AMD Ryzen 7 3700X.)

I guess we might have more data by tomorrow morning when most computers running 24/7 should have either failed tasks or produced zips.
32) Message boards : Number crunching : Should full credit be given for time on non successful tasks? (Message 70687)
Posted 24 days ago by Profile Dave Jackson
Post:
I looked around a bit and several projects seem to have had people with this issue. If for any reason the benchmark for your machine is optimistic it can cause this error. Manually rerunning benchmarks should solve it for future tasks but not those already downloaded.
33) Message boards : Number crunching : Should full credit be given for time on non successful tasks? (Message 70680)
Posted 29 days ago by Profile Dave Jackson
Post:
What gets me is the way over half the credit is quite often given to two or more users, in the case of some Linux tasks, before someone finally completes the work. I am really just floating it to get an idea of whether it would deter those who crash most tasks even if they gain a substantial amount of credit first. I certainly don't see it happening any time soon given the work Andy and Richard have put into sorting out the current system.
34) Message boards : Number crunching : Should full credit be given for time on non successful tasks? (Message 70675)
Posted 29 days ago by Profile Dave Jackson
Post:
I think that giving credit for non completed tasks based on trickle up messages is unique to CPDN. It originated when tasks taking four months or more were not unusual. Now the longest tasks still complete in under a month on a reasonably fast machine, most within two weeks running 24/7.

This was prompted by seeing a resend where one of the failures was on a machine that only completed about one in twenty tasks and sent a trickle up message every few days. I know the credit system has only just been rejigged so the credit script runs daily but I would like to pose the question as to whether we should move to a system of only granting credit for completed tasks? Do those crashing everything or almost everything ever look at their credit? I don't know. If they do, not getting any might prompt them to visit the fora to find out why everything is crashing? Just a thought.
35) Questions and Answers : Wish list : Website revamp. (Message 70674)
Posted 29 days ago by Profile Dave Jackson
Post:
The website should probably have something discouraging people who regularly turn off their machines from crunching this project as well. I have just picked up a resend this morning. On one machine, it didn't make the transition from global to regional model; I had that happen to one of mine in testing that is running fine on another machine. It is what it is. The other failure however is from a machine with lots of suspend requests that is completing less than 5% of its tasks, often taking several days between trickle up messages/zip uploads. It would be helpful if potential crunchers for the project were aware of the nature of the project and that repeated machine reboots are likely with some model types to result in task failures.
36) Message boards : Number crunching : Top participants RAC (Message 70667)
Posted 23 Mar 2024 by Profile Dave Jackson
Post:
The server packet includes a script that updates RAC from inactive users/hosts/teams following the same method that is used when an active user/host/team reports work.
Project admins just need to run it periodically.
The suggestion is once a day.


Thanks. I admit to being even more ignorant about the server side of things than I am about the client and manager code. Even there, how things change when I compile it myself following the same procedure often baffles me. I have yet to try building my own client and/or manager under Windows. Linux keeps me busy enough on that score!
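
To illustrate what such a periodic update would do to an inactive account, here is a rough sketch. It assumes RAC decays exponentially with a one-week half-life, which is my understanding of the usual BOINC behaviour; the function and constant names are made up for the example and this is not the actual server script.

```python
import math

# Assumption: one-week half-life for recent average credit.
HALF_LIFE_SECONDS = 7 * 24 * 3600


def decayed_rac(rac: float, seconds_since_update: float,
                half_life: float = HALF_LIFE_SECONDS) -> float:
    """Decay a stored RAC value for a user/host/team that has reported no new work."""
    if seconds_since_update <= 0:
        return rac
    # Exponential decay: the value halves every `half_life` seconds.
    return rac * math.exp(-seconds_since_update * math.log(2) / half_life)
```

Under those assumptions, a user who stops crunching would see their figure halve each week once the script is run, instead of staying frozen at its last value.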
37) Message boards : Number crunching : Top participants RAC (Message 70662)
Posted 23 Mar 2024 by Profile Dave Jackson
Post:
The problem is that the "current figure" only changes when the user in question is actively processing new work

Is that what happens with most projects with the server code, Richard? So a user who stops crunching with a very high average credit can remain near the top of the lists till eventually those with newer and faster computers can depose them?

Edit: Just thinking that if so, then a request for a change in the server code over at GitHub might be in order?
38) Message boards : Number crunching : New Work Announcements 2024 (Message 70657)
Posted 18 Mar 2024 by Profile Dave Jackson
Post:
Thanks Glen. I am down to one second attempt that came in this morning.
39) Questions and Answers : Wish list : Website revamp. (Message 70651)
Posted 12 Mar 2024 by Profile Dave Jackson
Post:
There are a number of things I can think of that might possibly be useful when this is done. I will list a few that spring to mind that might help those new to the project.

1. This project is not ideal for those who want to just sign up, then let things happen. Checking the forums regularly can be useful.
2. There are often periods of time without any work interspersed with lots of work in a famine and glut pattern.
3. Unlike some projects throughput of work is unlikely to increase by using virtual cores.
4. Even on the very fastest processors, many tasks take several days to complete.

I will add to this list when things come to mind but please feel free to contribute or to second ideas that you think are the most important. It would be good to get ideas from the point of view of "ordinary" crunchers as well as those involved through moderation and/or those with very advanced knowledge of computing, programming etc.
40) Questions and Answers : Windows : Question about my unit. (Message 70650)
Posted 12 Mar 2024 by Profile Dave Jackson
Post:
I think in future the plan is to send the abort signal from the project to abort tasks in that situation, though I am not 100% sure. I am pretty sure there was a post about it in the number crunching section but a lot of people don't check the forums regularly to spot these things. (I only go to the forums of other projects when there is a problem and I would guess a lot of people are like that with CPDN.)



©2024 climateprediction.net