climateprediction.net home page
Stuck upload issue

Falcon4
Joined: 4 Nov 06
Posts: 5
Credit: 3,764,885
RAC: 296
Message 57670 - Posted: 18 Jan 2018, 20:43:52 UTC - in response to Message 57624.  

6 days later, I've still got the exact same files in exactly the same spot trying to upload...

wah2_cam25_a0ib_200405_18_691_011370208_0_r706661527_12.zip - 30.03% (31.56/105.10 MB)
wah2_cam25_a0ib_200405_18_691_011370208_0_r706661527_13.zip - 9.93% (10.45/105.17 MB)
wah2_cam25_a0jx_200405_18_691_011370266_0_r76671198_10.zip - 52.51% (55.16/105.13 MB)
wah2_cam25_a0ea_200405_18_691_011370063_0_r1526698435_14.zip - 31.20% (32.83/105.20 MB)
wah2_cam25_a0ht_200405_18_691_011370190_0_r177040747_1.zip - 21.42% (22.56/105.30 MB)

Not sure if I should try messing with it... I really just kinda want to abort/clear these tasks at this point, as it's taking longer to upload than it did to compute in the first place :(
ID: 57670
Alan K
Joined: 22 Feb 06
Posts: 488
Credit: 30,548,813
RAC: 6,200
Message 57671 - Posted: 18 Jan 2018, 23:01:09 UTC - in response to Message 57655.  

This is the message from that link just now:

<data_server_reply>
<status>1</status>
<message>no command</message>
</data_server_reply>

so no change at the moment.
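
For anyone who wants to check that reply from a script rather than a browser, here is a minimal sketch that parses the XML quoted above. The function name `parse_data_server_reply` is made up for illustration and is not part of BOINC, and the sketch does not try to interpret what a given status value means.

```python
import xml.etree.ElementTree as ET


def parse_data_server_reply(xml_text):
    """Return (status, message) from a <data_server_reply> document."""
    root = ET.fromstring(xml_text)
    if root.tag != "data_server_reply":
        raise ValueError(f"unexpected root element: {root.tag}")
    # Hypothetical fallback: -1 when no <status> element is present.
    status = int(root.findtext("status", default="-1").strip())
    message = (root.findtext("message") or "").strip()
    return status, message


# The reply quoted in this post, copied verbatim.
reply = """<data_server_reply>
<status>1</status>
<message>no command</message>
</data_server_reply>"""

print(parse_data_server_reply(reply))
```

Comparing the tuple from one check to the next is enough to tell whether the reply has changed.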
ID: 57671
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4475
Credit: 18,448,326
RAC: 22,385
Message 57673 - Posted: 19 Jan 2018, 8:50:59 UTC

Will add another message to the batch's card.
ID: 57673
BetelgeuseFive
Joined: 31 Aug 04
Posts: 10
Credit: 2,538,005
RAC: 0
Message 57682 - Posted: 21 Jan 2018, 9:44:43 UTC

I have two stuck uploads, one of them for over a week, the other for a couple of days:

21/01/2018 10:41:08 | climateprediction.net | Started upload of wah2_cam25_a047_200405_18_689_011368595_0_r2068114942_7.zip
21/01/2018 10:41:32 | | Project communication failed: attempting access to reference site
21/01/2018 10:41:32 | climateprediction.net | Temporarily failed upload of wah2_cam25_a047_200405_18_689_011368595_0_r2068114942_7.zip: transient HTTP error
21/01/2018 10:41:32 | climateprediction.net | Backing off 03:59:04 on upload of wah2_cam25_a047_200405_18_689_011368595_0_r2068114942_7.zip
21/01/2018 10:41:35 | | Internet access OK - project servers may be temporarily down.
21/01/2018 10:42:07 | climateprediction.net | Started upload of wah2_cam25_a03s_200405_18_689_011368580_0_r616315434_17.zip
21/01/2018 10:42:31 | | Project communication failed: attempting access to reference site
21/01/2018 10:42:31 | climateprediction.net | Temporarily failed upload of wah2_cam25_a03s_200405_18_689_011368580_0_r616315434_17.zip: transient HTTP error
21/01/2018 10:42:31 | climateprediction.net | Backing off 04:12:33 on upload of wah2_cam25_a03s_200405_18_689_011368580_0_r616315434_17.zip
21/01/2018 10:42:32 | | Internet access OK - project servers may be temporarily down.

Anything I can do about this on my side or does this need to be resolved on the server side ?

Tom
ID: 57682
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4475
Credit: 18,448,326
RAC: 22,385
Message 57683 - Posted: 21 Jan 2018, 15:38:24 UTC

Sometimes exiting BOINC Manager and restarting it can shift stuck uploads, but I suspect this is a server-side problem. I messaged the project about it on Friday.
ID: 57683
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1084
Credit: 7,684,326
RAC: 4,454
Message 57686 - Posted: 22 Jan 2018, 0:30:20 UTC - in response to Message 57683.  
Last modified: 22 Jan 2018, 0:36:00 UTC

Sometimes exiting BOINC Manager and restarting it can shift stuck uploads, but I suspect this is a server-side problem. I messaged the project about it on Friday.

... my CAM upload has been stuck since 7 January: that CAM upload server needs a good kick with a steel-toecapped Dr Martens. I had the same error with a couple of the 145 x ~50 MB Zips for global batch #696, but they cleared after a day or so. It would have been an aggravating waste of time to calculate and upload most - but not all - of over 7 GB data. So whatever was done to the "global" server needs to be done to the CAM server, sharpish.

Edit: As it is, my CAM model has calculated and uploaded 17/18 x 100 MB - i.e. ~2 GB - which I assume is useless to the project without the missing Zip. I'm tempted to abort it, so at least someone else could re-run it.
ID: 57686
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4475
Credit: 18,448,326
RAC: 22,385
Message 57690 - Posted: 22 Jan 2018, 8:56:04 UTC
Last modified: 22 Jan 2018, 9:36:36 UTC

So whatever was done to the "global" server needs to be done to the CAM server, sharpish.



I suggested a while ago that the uploads could be redirected to Oxford, as was done with the batch destined for a Korean server. I have repeated that suggestion with slightly stronger wording.

Edit: I have been told they don't have the capacity at Oxford to take those two batches at present, but they will investigate whether there is a server somewhere else that can take the uploads.
ID: 57690
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2173
Credit: 64,760,426
RAC: 3,180
Message 57692 - Posted: 22 Jan 2018, 16:57:49 UTC

These stuck uploads have been happening occasionally for more than a few years now. But it seems to be the models with really big upload files that have the most problems. A server reboot seems to release the stuck uploads. The first time I remember a recurrent problem was with the ANZ runs back in March of 2014.
ID: 57692
tat
Joined: 11 Jan 18
Posts: 5
Credit: 129,961
RAC: 0
Message 57748 - Posted: 31 Jan 2018, 18:49:02 UTC
Last modified: 31 Jan 2018, 18:50:01 UTC

Hi. I'm new to the project, so I'm not sure whether the upload issue I'm experiencing is out of the ordinary or not.

The tasks reach 100% according to the Transfers tab in BOINC Manager, but then the project goes into backoff before starting the counter again at 0%.

Have emboldened the terms I'm not used to seeing in the event log below. Thanks

    31/01/2018 16:50:47 | climateprediction.net | Started upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 16:50:47 | climateprediction.net | Started upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip
    31/01/2018 16:50:47 | climateprediction.net | Sending scheduler request: To send trickle-up message.
    31/01/2018 16:50:47 | climateprediction.net | Not requesting tasks: "no new tasks" requested via Manager
    31/01/2018 16:50:49 | climateprediction.net | Scheduler request completed
    31/01/2018 16:52:08 | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: Disk quota exceeded
    31/01/2018 16:52:08 | climateprediction.net | Temporarily failed upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: transient upload error
    31/01/2018 16:52:08 | climateprediction.net | Backing off 00:02:04 on upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 16:52:09 | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip: Disk quota exceeded
    31/01/2018 16:52:09 | climateprediction.net | Temporarily failed upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip: transient upload error
    31/01/2018 16:52:09 | climateprediction.net | Backing off 00:02:28 on upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip
    31/01/2018 16:54:13 | climateprediction.net | Started upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 16:54:38 | climateprediction.net | Started upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip
    31/01/2018 16:55:07 | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: Disk quota exceeded
    31/01/2018 16:55:07 | climateprediction.net | Temporarily failed upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: transient upload error
    31/01/2018 16:55:07 | climateprediction.net | Backing off 00:06:38 on upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 16:55:37 | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip: Disk quota exceeded
    31/01/2018 16:55:37 | climateprediction.net | Temporarily failed upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip: transient upload error
    31/01/2018 16:55:37 | climateprediction.net | Backing off 00:04:53 on upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip
    31/01/2018 17:24:21 | climateprediction.net | Started upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 17:24:21 | climateprediction.net | Started upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip
    31/01/2018 17:26:22 | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: Disk quota exceeded
    31/01/2018 17:26:22 | climateprediction.net | Temporarily failed upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: transient upload error
    31/01/2018 17:26:22 | climateprediction.net | Backing off 00:08:57 on upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 17:26:28 | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip: Disk quota exceeded
    31/01/2018 17:26:28 | climateprediction.net | Temporarily failed upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip: transient upload error
    31/01/2018 17:26:28 | climateprediction.net | Backing off 00:11:09 on upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip
    31/01/2018 17:35:20 | climateprediction.net | Started upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 17:37:26 | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: Disk quota exceeded
    31/01/2018 17:37:26 | climateprediction.net | Temporarily failed upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: transient upload error
    31/01/2018 17:37:26 | climateprediction.net | Backing off 00:25:42 on upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 17:49:14 | climateprediction.net | Started upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip
    31/01/2018 17:50:56 | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip: Disk quota exceeded
    31/01/2018 17:50:56 | climateprediction.net | Temporarily failed upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip: transient upload error
    31/01/2018 17:50:56 | climateprediction.net | Backing off 00:24:50 on upload of wah2_sas50_q2jw_201612_13_707_011450425_1_r573430588_8.zip
    31/01/2018 18:12:22 | climateprediction.net | Started upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 18:14:40 | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: Disk quota exceeded
    31/01/2018 18:14:40 | climateprediction.net | Temporarily failed upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip: transient upload error
    31/01/2018 18:14:40 | climateprediction.net | Backing off 00:43:48 on upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
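
The backoff times in a log like the one above can be pulled out with a short script. This is a minimal sketch, assuming the event log has been saved as plain text in the `date time | project | message` layout shown here. The function name `backoff_seconds` is hypothetical, and the sample reuses three lines from the log.

```python
import re
from collections import defaultdict

# Matches event-log lines of the form:
#   <date> <time> | <project> | Backing off HH:MM:SS on upload of <file>
BACKOFF_RE = re.compile(
    r"^\s*(?P<stamp>\S+ \S+) \| [^|]* \| "
    r"Backing off (?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}) on upload of (?P<file>\S+)\s*$"
)


def backoff_seconds(log_text):
    """Return {file name: [backoff in seconds, ...]} in log order."""
    result = defaultdict(list)
    for line in log_text.splitlines():
        match = BACKOFF_RE.match(line)
        if match:
            seconds = (
                int(match["h"]) * 3600 + int(match["m"]) * 60 + int(match["s"])
            )
            result[match["file"]].append(seconds)
    return dict(result)


sample = """\
    31/01/2018 16:52:08 | climateprediction.net | Backing off 00:02:04 on upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 16:55:07 | climateprediction.net | Backing off 00:06:38 on upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
    31/01/2018 17:26:22 | climateprediction.net | Backing off 00:08:57 on upload of wah2_sas50_qbkv_201612_13_708_011462124_0_r1129882916_5.zip
"""

for name, delays in backoff_seconds(sample).items():
    print(name, delays)
```

In this log the delay for each file grows after every failed attempt, so the uploads are being retried at longer and longer intervals rather than abandoned.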

ID: 57748
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 57750 - Posted: 31 Jan 2018, 19:08:37 UTC - in response to Message 57748.  
Last modified: 31 Jan 2018, 19:10:17 UTC

I've been experiencing similar problems over the last half-day. I'm not yet aware of the cause(s).

Hang in with us. These things get sorted out sooner or later. (Servers are located around the globe and problems could be anywhere and of any type. [Curiously, I'm experiencing problems with EAS, SAS, EU, and NAM tasks -- currently, one task is uploading on each machine, albeit very slowly.])

Welcome to the project and the boards.
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 57750
bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 25,510,211
RAC: 5,420
Message 57751 - Posted: 31 Jan 2018, 19:09:16 UTC - in response to Message 57748.  
Last modified: 31 Jan 2018, 19:10:29 UTC

I'm having the same issue. It has been like this for 12 hours (since 8:00 CET). Mine are all SAS50, batches 708 and 707.
ID: 57751
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4475
Credit: 18,448,326
RAC: 22,385
Message 57754 - Posted: 31 Jan 2018, 20:53:22 UTC - in response to Message 57751.  
Last modified: 31 Jan 2018, 20:53:52 UTC

I noticed this morning that the site was down for a couple of hours. This may or may not be relevant. I have had one upload from 706 and one from 708 go through after the site came back on stream. They went through at the normal speed for me, which is a total upload speed of about 100 Mbit/second, but that is, I suspect, a limitation of my connection rather than anything to do with the project.
ID: 57754
Speedy
Joined: 20 Jul 05
Posts: 25
Credit: 414,873
RAC: 406
Message 57755 - Posted: 31 Jan 2018, 22:40:53 UTC - in response to Message 57754.  

I noticed this morning that the site was down for a couple of hours. This may or may not be relevant. I have had one upload from 706 and one from 708 go through after the site came back on stream. They went through at the normal speed for me, which is a total upload speed of about 100 Mbit/second, but that is, I suspect, a limitation of my connection rather than anything to do with the project.

Dave, I am curious to know how long it takes you to upload 45.97 MB. Looking at my manager, it takes just over 12 minutes, but currently the project is in backoff mode.
ID: 57755
Lockleys
Joined: 13 Jan 07
Posts: 195
Credit: 10,581,566
RAC: 0
Message 57756 - Posted: 1 Feb 2018, 0:03:43 UTC

I am having the same problem but they all eventually upload with the exception of a restart.zip for a cam25 model, batch 691 which is stuck at 27.63 percent.
ID: 57756
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4475
Credit: 18,448,326
RAC: 22,385
Message 57757 - Posted: 1 Feb 2018, 7:05:59 UTC - in response to Message 57755.  

Dave, I am curious to know how long it takes you to upload 45.97 MB. Looking at my manager, it takes just over 12 minutes, but currently the project is in backoff mode.


Mine took 18 minutes, but there were two uploads going at once, so I guess you can halve that. Worth noting that I don't get any faster than that when I run a speed test on my connection.
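
To put those two timings on the same footing, here is a rough sketch of the arithmetic. It assumes the 12 minutes is exact (the post says "just over"), that both of the simultaneous uploads were 45.97 MB files (the post does not say), and that 1 MB counts as 8 Mbit. The function name `throughput_mbit_per_s` is made up for illustration.

```python
def throughput_mbit_per_s(total_megabytes, minutes):
    """Average transfer rate in megabits per second (1 MB taken as 8 Mbit)."""
    return total_megabytes * 8 / (minutes * 60)


# One 45.97 MB file in about 12 minutes.
single = throughput_mbit_per_s(45.97, 12)

# Two files at once in 18 minutes, assuming both are 45.97 MB.
pair = throughput_mbit_per_s(2 * 45.97, 18)

print(f"single upload: {single:.2f} Mbit/s")
print(f"two uploads:   {pair:.2f} Mbit/s")
```

On those assumptions both connections average well under 1 Mbit/s, so the two timings are roughly comparable.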
ID: 57757
keputnam
Joined: 31 Aug 04
Posts: 26
Credit: 3,954,540
RAC: 679
Message 57758 - Posted: 1 Feb 2018, 7:06:50 UTC

I'm now getting

1/31/2018 11:03:42 PM | climateprediction.net | [error] Error reported by file upload server: can't write file wah2_sas50_q3b2_201612_13_707_011451403_0_r448480803_11.zip: Disk quota exceeded

<sigh>

if it's not one project with capacity problems, it's another

ID: 57758
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 57759 - Posted: 1 Feb 2018, 9:49:26 UTC - in response to Message 57758.  

The project team are aware of the upload problem for WAH2 SAS50 batch 706, 707 and 708 files. Upload server upload2.cpdn.org has reached its disk quota and they are making more space available.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 57759
tat
Joined: 11 Jan 18
Posts: 5
Credit: 129,961
RAC: 0
Message 57762 - Posted: 1 Feb 2018, 13:28:13 UTC

Selfishly speaking, I'm glad it's not just me ;)

Thanks for the responses. And to those working hard on our behalf.
ID: 57762
WB8ILI
Joined: 1 Sep 04
Posts: 161
Credit: 81,488,986
RAC: 868
Message 57764 - Posted: 1 Feb 2018, 18:52:41 UTC

I have had a stuck upload from batch 691 for so many weeks I have lost track.
ID: 57764
Eirik Redd
Joined: 31 Aug 04
Posts: 391
Credit: 219,896,461
RAC: 649
Message 57765 - Posted: 2 Feb 2018, 7:49:17 UTC - in response to Message 57759.  
Last modified: 2 Feb 2018, 7:56:05 UTC

The project team are aware of the upload problem for WAH2 SAS50 batch 706, 707 and 708 files. Upload server upload2.cpdn.org has reached its disk quota and they are making more space available.


Thanks for the update.
Also thanks for letting us know that the upload server managers have no clue from the project as to what space provision is needed.

We contributors would like to think the infrastructure is well-managed.
Obviously not.

I keep on crunching - knowing how underpaid and overworked the support staff are
ID: 57765

©2024 cpdn.org