Posts by MossyRock

1) Message boards : Number crunching : Abnormally long-running models (Message 57910)
Posted 8 Mar 2018 by MossyRock
Post:
Thank you, Dave.
2) Message boards : Number crunching : Abnormally long-running models (Message 57906)
Posted 8 Mar 2018 by MossyRock
Post:
Strange - during the entire time I have a batch of models running I am careful to not interrupt them in ANY way, including suspends. I learned THAT lesson a while back.

Since you're seeing evidence that a suspend occurred and caused a model failure, it must have happened while I was fiddling with this hung task, trying to get it going again (it was gaining no CPU time for most of the day, and showing NO time left until completion).
3) Message boards : Number crunching : Abnormally long-running models (Message 57896)
Posted 6 Mar 2018 by MossyRock
Post:
Thanks for the tips! I will try these if it happens again in the future with other models.
4) Message boards : Number crunching : Abnormally long-running models (Message 57890)
Posted 5 Mar 2018 by MossyRock
Post:
Thanks, Dave.

Do you know what went wrong with the second task that I listed? Did I do the right thing by aborting it?
5) Message boards : Number crunching : Abnormally long-running models (Message 57888)
Posted 5 Mar 2018 by MossyRock
Post:
Thanks for the info!
6) Message boards : Number crunching : Abnormally long-running models (Message 57884)
Posted 5 Mar 2018 by MossyRock
Post:
Hey,

This task below has been running on my Ryzen 7 machine for more than 8.5 days, and shows another 5.5 days remaining:

https://www.cpdn.org/cpdnboinc/result.php?resultid=21095973

Is this normal? Should I let it continue or should I abort it?

The current wah2 tasks on my machine run between 4 and 9 days, so 14 days running is quite unusual.

This NEXT task below had also been running for more than 8 days, and showed about 95% complete and "running," but it was actually stalled and was sitting there all day, accumulating no more run time. The "remaining" column showed "---". I aborted it. I didn't know what else to do.

https://www.cpdn.org/cpdnboinc/result.php?resultid=21095971
7) Message boards : Number crunching : Stuck upload issue (Message 57405)
Posted 27 Nov 2017 by MossyRock
Post:
I had the exact same problem with this same type of model.

See my recent thread here: https://www.cpdn.org/cpdnboinc/forum_thread.php?id=8516

You can wait for the server-end problem to be resolved, at which time your files will upload, or you can abort the model showing as uploading in BM and then abort the transfers.

After waiting for days I decided to clean things up and do the aborts. If you report the server name that is having the problem perhaps the third party will fix it so your uploads will complete.
8) Message boards : Number crunching : Strange Trickle Upload Problem (Message 57396)
Posted 26 Nov 2017 by MossyRock
Post:
The model finished but all of the uploads were stuck. I aborted the uploads.

Thanks for investigating this.
9) Message boards : Number crunching : Strange Trickle Upload Problem (Message 57391)
Posted 24 Nov 2017 by MossyRock
Post:
Task 20901466 (Workunit 11361856) is currently running on one of my machines with about 19 hours left to go, and is showing 10 trickles for this task on my account's tasks web page.

However, those SAME 10 trickles are still showing in BM's transmit queue and each time it tries to re-upload them it gets the error:

"Temporarily failed upload of <filename.zip>: connect() failed"

"Internet access ok - project servers may be temporarily down"

There have been no interruptions to this task, and other tasks are uploading trickles with no issues.

Why are they still in the queue if they've been transmitted?

Do I abort these queued transfers?
10) Message boards : Number crunching : New work Discussion (Message 57385)
Posted 22 Nov 2017 by MossyRock
Post:
Three models from my most recently downloaded batch of 11 have crashed. These models also crashed for my "wingmen," if that is an accurate term to use at CPDN.

Task 11361810 - wah2
Signal 11 received: Segment violation

Task 11339935 - pnw25
Unknown error

Task 11278191 - wah2
Signal 4 received: Illegal instruction - invalid function image
Signal 4 received: Floating point exception
Signal 4 received: Segment violation

Not sure what is going on. There have been no interruptions in processing at all (i.e., suspends, reboots, etc.).
11) Message boards : Number crunching : No Graphics (Message 56828)
Posted 12 Sep 2017 by MossyRock
Post:
Jim,

Thanks for your response.

The graphics provided much more than eye candy. They made the models come alive.

Graphics gave us feedback as to what was happening and what may happen in the future. As it stands now, we get very little feedback.

I would like to present BOINC to the local school system to see if there is any interest from their science departments, and was thinking of showcasing CPDN as a project of interest. However, without graphics I think the students would lose interest in it as it would just be jobs running in a job queue, with nothing to see and analyze.

Keeping people interested in this project should be a major concern, and graphics is definitely one way to do it.

Just my humble opinion...
12) Message boards : Number crunching : No Graphics (Message 56823)
Posted 11 Sep 2017 by MossyRock
Post:
Why are there no graphics with most or all of the models being released now?

I've checked my settings and I cannot see anything that I've missed. I'm running SETI@Home alongside and its graphics are working fine.

Thanks.
13) Message boards : Number crunching : Validation Pending (Message 56768)
Posted 3 Sep 2017 by MossyRock
Post:
Thanks, geophi.
14) Message boards : Number crunching : Validation Pending (Message 56764)
Posted 3 Sep 2017 by MossyRock
Post:
I have 70 completed work units in the "validation pending" state, some of them recent that I just completed, but most of them are from years ago.

Just wondering about the validation process - when do new ones get validated, and why do I have so many that are still in this state?

Thanks.
15) Message boards : Number crunching : Miscellaneous problems (Message 54112)
Posted 16 May 2016 by MossyRock
Post:
Les,

I followed your advice for the steps to take for a reboot/restart and my WAH2 tasks survived.

Thank you.
16) Message boards : Number crunching : Miscellaneous problems (Message 54109)
Posted 16 May 2016 by MossyRock
Post:
Suspending work units individually causes new work units that are ready to start to begin running to "fill in the hole."

This can cause quite a mess, especially if there are new CPDN models in your queue that are ready to start. You will end up with the ones that were running originally, now suspended, plus the new ones that start that you have to suspend also.

Is there a way to prevent new work units from starting as you go about suspending work units individually?
17) Message boards : Number crunching : Miscellaneous problems (Message 54105)
Posted 15 May 2016 by MossyRock
Post:
Iain,

Thanks for your response.

Was BOINC Manager up and running at the time of the reboot, even though the WAH2 model was suspended?

There's a setting in Win 10 to tell it NOT to reboot after updates until you explicitly give it the ok to do so - I have it set on my Win 10 machine that doesn't do any BOINC, and it has never rebooted on its own, at least, not yet.

Have you set this option on yours? If you have, are you saying that it went ahead and rebooted on its own without your input?
18) Message boards : Number crunching : Miscellaneous problems (Message 54097)
Posted 14 May 2016 by MossyRock
Post:
Ah, I just read Les' recommendation to suspend everything before exiting Boinc Manager.

Would doing this prevent the crashes I experienced?

Thanks.
19) Message boards : Number crunching : Miscellaneous problems (Message 54096)
Posted 14 May 2016 by MossyRock
Post:
I had 2 WAH2 tasks fail back-to-back after I restarted Boinc Manager after a machine reboot for Windows updates.

I always perform a normal, routine, controlled Boinc Manager shutdown before a reboot: File > Exit Boinc > Stop running tasks when exiting the BOINC Manager (checked).

The failed tasks are WUs 10371899 and 10351844.

Outcome: Client error
Client State: Compute error
Validate State: Invalid.

Here is what is in the Boinc Manager Event Log:

5/11/2016 7:40:32 AM | climateprediction.net | Message from task: 0
5/11/2016 7:40:32 AM | climateprediction.net | Computation for task wah2_eu25_d756_193612_13_366_010351844_0 finished
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_3.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_4.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_5.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_6.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_7.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_8.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_9.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_10.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_11.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_12.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_13.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_14.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:41:02 AM | climateprediction.net | Message from task: 0
5/11/2016 7:41:02 AM | climateprediction.net | Computation for task wah2_eu25_i76p_198612_13_366_010371899_1 finished
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_2.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_3.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_4.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_5.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_6.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_7.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_8.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_9.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_10.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_11.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_12.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_13.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_14.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
5/11/2016 8:40:49 AM | climateprediction.net | Sending scheduler request: To report completed tasks.
5/11/2016 8:40:49 AM | climateprediction.net | Reporting 2 completed tasks
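For anyone who wants to total up the missing files from a log like the one above, here is a minimal Python sketch. It assumes the Event Log lines have been copied out as plain text in the same "date | project | message" layout shown above. The names `summarize_absent` and `ABSENT_RE` are hypothetical helpers of my own, not part of BOINC, and the sample lines are copied from the log above.

```python
import re
from collections import defaultdict

# Matches Event Log messages of the form shown in the log above:
#   "... | Output file <file>.zip for task <task> absent"
# (the layout is assumed from that log, not from any BOINC specification)
ABSENT_RE = re.compile(r"\|\s*Output file (\S+) for task (\S+) absent\s*$")


def summarize_absent(log_text):
    """Return {task_name: [missing output file names]} from Event Log text."""
    missing = defaultdict(list)
    for line in log_text.splitlines():
        match = ABSENT_RE.search(line)
        if match:
            filename, task = match.groups()
            missing[task].append(filename)
    return dict(missing)


# A few lines copied from the Event Log above, used as sample input.
sample = """\
5/11/2016 7:40:32 AM | climateprediction.net | Computation for task wah2_eu25_d756_193612_13_366_010351844_0 finished
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_3.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:40:32 AM | climateprediction.net | Output file wah2_eu25_d756_193612_13_366_010351844_0_4.zip for task wah2_eu25_d756_193612_13_366_010351844_0 absent
5/11/2016 7:41:02 AM | climateprediction.net | Output file wah2_eu25_i76p_198612_13_366_010371899_1_2.zip for task wah2_eu25_i76p_198612_13_366_010371899_1 absent
"""

for task, files in summarize_absent(sample).items():
    print(f"{task}: {len(files)} output file(s) absent")
```

Pasting the full log into `sample` would list each failed task with the number of output files it never produced, which may be easier to quote in a report than the raw log.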

That's a lot of processing time down the drain.

Any ideas why this happened?

Thanks.
20) Message boards : Number crunching : Total Credit (Message 53904)
Posted 6 Apr 2016 by MossyRock
Post:
Some things never change... death, taxes, and CPDN credit problems.

I've been away from the project for over a year, and on returning I see that there's still endless discussion about CPDN credits. In fact, it's still the most active - it's at the top of the non-sticky threads list, where it always has been.

I have participated in many different projects, and this is the only one that I have ever experienced credit issues with.

The other projects' credit systems just work.

For me, personally, I don't care much about whether I get CPDN credits or not.

But it would be nice if this could be fixed, once and for all. There's too much time and energy being wasted in this never-ending squabble.

