Upload failures
Author | Message |
---|---|
Send message Joined: 7 Aug 04 Posts: 2111 Credit: 58,047,187 RAC: 692 |
@Speedy The error message in stderr on the task page says "The system cannot find the drive specified." This is an error that crops up occasionally. No one knows the cause. It's not typically reproduced in the other tasks in the work unit. It may be some kind of timing issue when the model tries to write to, or read from, the disk. The errors listed in your post appear only because the model crashed before those monthly upload files were created. It was expecting to upload them and they were never generated. Unfortunately, that listing is not useful for finding the cause of the crash. |
Send message Joined: 5 Sep 04 Posts: 7609 Credit: 24,240,330 RAC: 2,564 |
Climate models have lots of files open, which all need saving at checkpoints. With your computer having so many processors, it will need a VERY fast HD to keep up with all that saving when it occurs at the same time. |
Send message Joined: 20 Jul 05 Posts: 25 Credit: 409,712 RAC: 0 |
"Climate models have lots of files open, which all need saving at checkpoints." Thank you for pointing this out. I have cut it down to working on three tasks at a time. I'm not sure, but maybe when I turned my machine off last night it was trying to upload a trickle message. "@Speedy The error message in stderr on the task page says 'The system cannot find the drive specified.' This is an error that crops up occasionally. No one knows the cause. It's not typically reproduced in the other tasks in the work unit. It may be some kind of timing issue when the model tries to write to, or read from, the disk." Thank you for explaining the error message; it makes complete sense. Hopefully I will be able to complete other tasks without them crashing. |
Send message Joined: 23 Feb 05 Posts: 4 Credit: 1,291,717 RAC: 1,166 |
I've had one cam25 zip transfer failing with 'transient HTTP error' since about Oct 26: https://www.cpdn.org/result.php?resultid=21743279 Can it be fixed or is the advice still to abort? |
Send message Joined: 5 Sep 04 Posts: 7609 Credit: 24,240,330 RAC: 2,564 |
Might as well abort, I can't see them finding what's wrong any time soon. |
Send message Joined: 23 Feb 05 Posts: 4 Credit: 1,291,717 RAC: 1,166 |
I logged onto the machine now intending to abort it and funnily enough the transfer had finally succeeded just a couple of hours ago. ¯\_(ツ)_/¯ |
Send message Joined: 5 Sep 04 Posts: 7609 Credit: 24,240,330 RAC: 2,564 |
That's good. After I posted here, I put a message on our project board, and the researcher in Mexico picked it up soon after and posted: Hi! There was a brief shutdown on the 26th of October (for maintenance), so that might be the cause. They're going to look into it when they're back in the office. I thought: Great. Too late now. I'll post another internal message. |
Send message Joined: 23 Feb 05 Posts: 4 Credit: 1,291,717 RAC: 1,166 |
Very good, thanks for sorting it out then. |
Send message Joined: 27 May 15 Posts: 3 Credit: 109,766 RAC: 0 |
Hi, I can't get WUs to upload either. Upload: Pending (project backoff, and then a timer) |
Send message Joined: 5 Sep 04 Posts: 7609 Credit: 24,240,330 RAC: 2,564 |
If you're talking about batch 869, PNW, then it's been closed. Sorry, but you took too long. Just abort them. |
Send message Joined: 15 May 09 Posts: 3426 Credit: 10,438,930 RAC: 9,029 |
"So out of curiosity, how much internet should I be using *per* task vs. drive space? Is it 2 GB per task, which progressively gets smaller on the drive as the task moves along? Additionally, I'm guessing that if an upload fails it would show up either a) in the tasks list on the website or b) in the BOINC transfers window?" Currently I have ten tasks on my Ryzen. Disk usage for CPDN is 25.5 GB. My laptop, with just one task, is currently using 3 GB of disk space. Uploads, from memory, are about 100 MB for the zips on my Linux boxes. The only time this is a problem for me, with bored as opposed to broad band, is during Zoom calls, when I suspend internet activity for BOINC. I think there is also the facility to restrict the bandwidth used, but as it is on the same machine I use for Zoom, I just make sure BOINC can't grab my bandwidth. Please do not private message myself or other moderators for help. This limits the number of people who are able to help and deprives others who may benefit from the answer. |
Send message Joined: 27 May 15 Posts: 3 Credit: 109,766 RAC: 0 |
"If you're talking about batch 869, PNW, then it's been closed." Thanks. The deadline of the work units was the end of April this year. I didn't know they become invalid if someone is quicker. Good to know. |
Send message Joined: 5 Sep 04 Posts: 7609 Credit: 24,240,330 RAC: 2,564 |
The "deadline" that's often quoted is just a very long time, set to stop BOINC from causing problems when shorter tasks from other projects are run at the same time. The ones that I mentioned are the real deadlines here. And whether or not the long BOINC time is needed anymore is the subject of much discussion. |
Send message Joined: 6 Oct 06 Posts: 177 Credit: 7,162,291 RAC: 11,114 |
We (Pakistan, South Asia) have had the warmest winter that I have seen in the past 65 years. It might become a record of sorts. Let us say, no winter at all. Just Autumn, Spring type temps. |
Send message Joined: 9 Oct 20 Posts: 221 Credit: 1,789,184 RAC: 2,530 |
"The 'deadline' that's often quoted, is just a very long time to stop BOINC from causing problems when shorter tasks from other projects are run at the same time." At the risk of repeating what's already been said, if the server lies to the client and user about when the work must be done, it won't be done on time. If it has to be done in, say, 2 months, then set that as the deadline. Then the client will do it more urgently if it's nearing that time, and the user can know if their computer is too slow (or not switched on often enough) and wasting time doing these big tasks. Because of this nonsense, Merowig has just pointlessly wasted electricity running tasks that are now cancelled. |
Send message Joined: 9 Oct 20 Posts: 221 Credit: 1,789,184 RAC: 2,530 |
"We (Pakistan, South Asia) have had the warmest winter that I have seen in the past 65 years. It might become a record of sorts. Let us say, no winter at all. Just Autumn, Spring type temps." It's snowing here. The scientists obviously haven't told the clouds about the new regulations. |
Send message Joined: 22 Feb 06 Posts: 434 Credit: 19,406,456 RAC: 6,903 |
Would it be possible to post when batches are closed because enough results are in for the researchers to work on? |
Send message Joined: 5 Sep 04 Posts: 7609 Credit: 24,240,330 RAC: 2,564 |
Usually they're closed a long time after they're issued, to free up storage space. Several (all?) of the AFLAME batches have just been closed. These started in April 2019. More than enough time to crunch them. |
Send message Joined: 6 Oct 06 Posts: 177 Credit: 7,162,291 RAC: 11,114 |
"We (Pakistan, South Asia) have had the warmest winter that I have seen in the past 65 years. It might become a record of sorts. Let us say, no winter at all. Just Autumn, Spring type temps." "It's snowing here. The scientists obviously haven't told the clouds about the new regulations." Peter, what do scientists have to do with clouds? Just return us our snow. Himalayas, Hindu Kush, Pamirs, etc. A sort of drought, or, to confuse, drouth; you can take whichever word you can understand. ;) This year will be the warmest on record. |
Send message Joined: 5 Sep 04 Posts: 7609 Credit: 24,240,330 RAC: 2,564 |
And this thread is not about weather. Personal observations about this should be in the Cafe section. |
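
On the earlier disk and upload question, here is a rough back-of-envelope sketch in Python using only the figures quoted in the thread (ten tasks using 25.5 GB, one task using 3 GB, zips of about 100 MB). The function names and the 1 Mbit/s uplink speed are hypothetical, chosen purely for illustration; none of this is an official CPDN or BOINC figure.

```python
# Figures quoted in the thread: one user's snapshot, not project specifications.
RYZEN_TASKS = 10
RYZEN_DISK_GB = 25.5
LAPTOP_TASKS = 1
LAPTOP_DISK_GB = 3.0
UPLOAD_ZIP_MB = 100.0  # approximate, quoted "from memory" in the post


def disk_per_task_gb(total_gb: float, tasks: int) -> float:
    """Average disk footprint per task, in GB."""
    if tasks <= 0:
        raise ValueError("tasks must be positive")
    return total_gb / tasks


def upload_seconds(zip_mb: float, uplink_mbit_per_s: float) -> float:
    """Rough time to send one zip; 1 MB is taken as 8 Mbit (hypothetical helper)."""
    if uplink_mbit_per_s <= 0:
        raise ValueError("uplink speed must be positive")
    return zip_mb * 8 / uplink_mbit_per_s


if __name__ == "__main__":
    print(f"Ryzen:  {disk_per_task_gb(RYZEN_DISK_GB, RYZEN_TASKS):.2f} GB per task")
    print(f"Laptop: {disk_per_task_gb(LAPTOP_DISK_GB, LAPTOP_TASKS):.2f} GB per task")
    # The 1 Mbit/s uplink is an assumed value; substitute your own line speed.
    print(f"One zip at 1 Mbit/s: {upload_seconds(UPLOAD_ZIP_MB, 1.0):.0f} s")
```

On those numbers the footprint comes out at roughly 2.5 to 3 GB per task, and a single zip takes a little over 13 minutes on the assumed 1 Mbit/s uplink. Actual usage will vary by model type and how far the task has progressed.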
©2022 climateprediction.net