Stuck upload issue

Mephist0
Joined: 21 Feb 08
Posts: 47
Credit: 7,929,915
RAC: 0
Message 57572 - Posted: 5 Jan 2018, 14:06:09 UTC
Last modified: 5 Jan 2018, 14:11:56 UTC

Uploading 7.2 GB of eas50 data at the moment. It's going fine! Speed is 2 MiB/s per file, 8 MiB/s in total :D

Great job guys!
ID: 57572

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4309
Credit: 16,355,267
RAC: 6,278
Message 57573 - Posted: 5 Jan 2018, 15:01:33 UTC

Looks as if there is a problem with upload6 as well:


On investigation this one seems to be working. The transient HTTP error may just be a coincidence of lots of uploads going to it. If the problem is widespread, someone will increase the timeout value on the server, which should deal with it.

Good news that the other uploads are going through. Some may experience transient errors on upload7 as well, with lots of people's uploads trying to get through.
ID: 57573

Doug Jenkins
Joined: 28 Aug 04
Posts: 2
Credit: 2,472,174
RAC: 515
Message 57574 - Posted: 5 Jan 2018, 15:17:57 UTC - in response to Message 57573.  

Excellent, thank you! Mine are all uploaded now.
ID: 57574

Alex Plantema
Joined: 3 Sep 04
Posts: 126
Credit: 26,363,193
RAC: 0
Message 57580 - Posted: 5 Jan 2018, 22:00:36 UTC

My files have been uploaded except one, which got stuck at 14.96%:
https://www.cpdn.org/cpdnboinc/result.php?resultid=20936151

vr 05-01-2018 16:22:13 | climateprediction.net | Started upload of wah2_cam25_a07g_200405_18_689_011368712_1_r81990675_6.zip
vr 05-01-2018 16:22:38 | | Project communication failed: attempting access to reference site
vr 05-01-2018 16:22:38 | climateprediction.net | Temporarily failed upload of wah2_cam25_a07g_200405_18_689_011368712_1_r81990675_6.zip: transient HTTP error
vr 05-01-2018 16:22:38 | climateprediction.net | Backing off 04:46:18 on upload of wah2_cam25_a07g_200405_18_689_011368712_1_r81990675_6.zip
vr 05-01-2018 16:22:40 | | Internet access OK - project servers may be temporarily down.
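For anyone wondering when the client will try again on its own, the back-off interval can simply be added to the timestamp of the "Backing off" line. A minimal Python sketch, assuming the log layout shown above (weekday abbreviation, then DD-MM-YYYY HH:MM:SS, with fields separated by "|"); the function name is made up for illustration and is not part of BOINC:

```python
import re
from datetime import datetime, timedelta

# Matches the message part of a "Backing off" event log line.
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")

def next_retry(line):
    """Return (file_name, retry_time) for a 'Backing off' line, else None."""
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 3:
        return None
    match = BACKOFF_RE.search(fields[2])
    if match is None:
        return None
    hours, minutes, seconds, name = match.groups()
    # Keep only the date and time; this drops a leading weekday such as "vr".
    stamp = " ".join(fields[0].split()[-2:])
    logged = datetime.strptime(stamp, "%d-%m-%Y %H:%M:%S")
    delay = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
    return name, logged + delay

line = ("vr 05-01-2018 16:22:38 | climateprediction.net | Backing off 04:46:18 "
        "on upload of wah2_cam25_a07g_200405_18_689_011368712_1_r81990675_6.zip")
print(next_retry(line))
```

For the line above this works out to a retry at about 21:08:56 on the same day, unless the transfer is retried by hand first.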
ID: 57580

PDW
Joined: 29 Nov 17
Posts: 55
Credit: 6,483,504
RAC: 2,069
Message 57582 - Posted: 6 Jan 2018, 9:02:43 UTC - in response to Message 57580.  

I still have one task that is stuck uploading to upload6, it has sent 6.66/104.92MB so far...

06/01/2018 08:54:02 | climateprediction.net | [fxd] starting upload, upload_offset -1
06/01/2018 08:54:02 | climateprediction.net | Started upload of wah2_cam25_a03y_200405_18_689_011368586_0_r102364835_9.zip
06/01/2018 08:54:02 | climateprediction.net | [file_xfer] URL: http://upload6.cpdn.org/cgi-bin/file_upload_handler
06/01/2018 08:54:03 | climateprediction.net | [file_xfer] http op done; retval 0 (Success)
06/01/2018 08:54:03 | climateprediction.net | [file_xfer] parsing upload response: <data_server_reply>    <status>0</status>    <file_size>6964580</file_size></data_server_reply>
06/01/2018 08:54:03 | climateprediction.net | [file_xfer] parsing status: 0
06/01/2018 08:54:03 | climateprediction.net | [fxd] starting upload, upload_offset 6964580
06/01/2018 08:54:27 | climateprediction.net | [file_xfer] http op done; retval -184 (transient HTTP error)
06/01/2018 08:54:27 | climateprediction.net | [file_xfer] file transfer status -184 (transient HTTP error)
06/01/2018 08:54:27 | climateprediction.net | Temporarily failed upload of wah2_cam25_a03y_200405_18_689_011368586_0_r102364835_9.zip: transient HTTP error
06/01/2018 08:54:27 | climateprediction.net | Backing off 03:42:03 on upload of wah2_cam25_a03y_200405_18_689_011368586_0_r102364835_9.zip
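Going by the log above, the client first asks the server how much of the file it already holds (upload_offset -1), then resumes from the byte count the server reports. A rough Python sketch of reading that reply, using the XML from the log line; the total size is only an approximation taken from the 104.92MB quoted above (assumed here to mean MiB), and the helper name is invented for illustration:

```python
import xml.etree.ElementTree as ET

def parse_upload_reply(reply):
    """Return (status, bytes_already_on_server) from a data_server_reply."""
    root = ET.fromstring(reply)
    status = int(root.findtext("status"))
    size_text = root.findtext("file_size")
    # Replies without a <file_size> element give None for the size.
    size = int(size_text) if size_text is not None else None
    return status, size

reply = ("<data_server_reply>    <status>0</status>    "
         "<file_size>6964580</file_size></data_server_reply>")
status, offset = parse_upload_reply(reply)

total_bytes = int(104.92 * 1024 * 1024)  # assumption: "104.92MB" means MiB
print(f"status={status}, resume offset={offset} bytes, "
      f"{100 * offset / total_bytes:.2f}% of the file")
```

The offset it prints is the same 6964580 that appears in the second "[fxd] starting upload" line, so each retry starts from where the server says it got to rather than from zero.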
ID: 57582

Koert
Joined: 5 Sep 07
Posts: 9
Credit: 10,783,131
RAC: 0
Message 57583 - Posted: 6 Jan 2018, 15:38:13 UTC

All stuck ones are uploaded now. Thanks.
ID: 57583

Harri Liljeroos
Joined: 9 Dec 05
Posts: 110
Credit: 12,038,780
RAC: 1,393
Message 57586 - Posted: 6 Jan 2018, 17:16:51 UTC

I also have one of those wah2_cam25... files stuck in upload (_2.zip). It has been at 42.3% for over a week now. The task is otherwise finished and all other zips have uploaded but this one is not budging. When it retries it starts from about 42.1% and stops again at 42.3%. Restarting Boinc, cable modem or the Host does not help. Here's the task: https://www.cpdn.org/cpdnboinc/result.php?resultid=20921964
ID: 57586

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4309
Credit: 16,355,267
RAC: 6,278
Message 57594 - Posted: 8 Jan 2018, 12:24:36 UTC

I will let the project people know. (First will check on relevant board to see if anyone else has.)
ID: 57594

PDW
Joined: 29 Nov 17
Posts: 55
Credit: 6,483,504
RAC: 2,069
Message 57595 - Posted: 8 Jan 2018, 12:50:40 UTC - in response to Message 57582.  

I still have one task that is stuck uploading to upload6, it has sent 6.66/104.92MB so far...

06/01/2018 08:54:02 | climateprediction.net | Started upload of wah2_cam25_a03y_200405_18_689_011368586_0_r102364835_9.zip

This one is still stuck, I'll post when it has gone rather than keep saying it hasn't.

PS. Thanks for the credit export done this morning.
ID: 57595

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4309
Credit: 16,355,267
RAC: 6,278
Message 57596 - Posted: 8 Jan 2018, 14:25:29 UTC - in response to Message 57595.  

I have let Sarah know that at least 5 people have one or more stuck uploads from the cam25 batches. I have checked out a few of them myself and the tasks should have finished as all 18 trickle files have been uploaded. I haven't been able to work out any pattern as the numbers of the zip files that are stuck differ between computers. My guess is the server involved is connected via a connection that can't handle the amount of data being thrown at it and the overload may well be nothing to do with CPDN but something else going through it.

The project may decide on their own to do a temporary redirect to Oxford. If not, I may suggest it in a few days' time if nothing improves.
ID: 57596

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1079
Credit: 6,837,697
RAC: 6,537
Message 57602 - Posted: 8 Jan 2018, 16:51:11 UTC

The CAM25 upload error I'm getting is:

07/01/2018 16:07:47 | climateprediction.net | [error] Error reported by file upload server: [wah2_cam25_a033_200405_18_689_011368555_0_r1524206355_14.zip] locked by file_upload_handler PID=102372

Maybe just an upload server reboot to clear the locks? Subsequent Zips from that model have uploaded without any problem.
ID: 57602

Harri Liljeroos
Joined: 9 Dec 05
Posts: 110
Credit: 12,038,780
RAC: 1,393
Message 57608 - Posted: 8 Jan 2018, 19:06:19 UTC - in response to Message 57586.  

I also have one of those wah2_cam25... files stuck in upload (_2.zip). It has been at 42.3% for over a week now. The task is otherwise finished and all other zips have uploaded but this one is not budging. When it retries it starts from about 42.1% and stops again at 42.3%. Restarting Boinc, cable modem or the Host does not help. Here's the task: https://www.cpdn.org/cpdnboinc/result.php?resultid=20921964

The stuck file I reported before has been uploaded, only to be replaced with another stuck one from cam25 (_3.zip) here: https://www.cpdn.org/cpdnboinc/result.php?resultid=20921445
ID: 57608

Alan K
Joined: 22 Feb 06
Posts: 484
Credit: 29,559,333
RAC: 6,497
Message 57611 - Posted: 9 Jan 2018, 23:03:54 UTC - in response to Message 57608.  
Last modified: 9 Jan 2018, 23:06:21 UTC

I have a stuck zip from this lot; it has been at 38% (40 MB out of 105) for 6 days now. It's _17.zip.
ID: 57611

Falcon4
Joined: 4 Nov 06
Posts: 5
Credit: 3,764,885
RAC: 296
Message 57624 - Posted: 12 Jan 2018, 22:47:54 UTC
Last modified: 12 Jan 2018, 22:55:45 UTC

I've been having upload/download issues for a while now... I have a 16-core (32 w/ HT) system I'm trying to keep busy, and CPDN just barely trickles me any work with its priority of 1500. The computer has another pairing with World Community Grid set to priority 0, and its task list is consistently full of either "Uploading" CPDN tasks, or running WCG tasks... but rarely ever more than a couple CPDN tasks. The uploads of tasks take days, not just hours, because of the constant failures.

Is there just not enough work to do...?

tech details: I've got 200405_18_691..._*.zip items queued:
wah2_cam25_a0ib_18_691_011370208_0_r706661527_12.zip - 30.03% (31.56/105.10 MB)
wah2_cam25_a0ib_18_691_011370208_0_r706661527_13.zip - 9.93% (10.45/105.17 MB)
wah2_cam25_a0jx_18_691_011370266_0_r76671198_10.zip - 52.51% (55.16/105.13 MB)
wah2_cam25_a0ea_18_691_011370063_0_r1526698435_14.zip - 31.20% (32.83/105.20 MB)
wah2_cam25_a0ht_18_691_011370190_0_r177040747_1.zip - 21.42% (22.56/105.30 MB)
ID: 57624

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4309
Credit: 16,355,267
RAC: 6,278
Message 57626 - Posted: 13 Jan 2018, 8:20:14 UTC - in response to Message 57624.  

I will post another message to the project about the stuck uploads though it won't be seen till Monday probably.

If you click on main page at the bottom of this page, then on server status on the right, you will see that there is currently no work waiting to go out. (Look at the number of tasks by model type rather than the 27 towards the top, which really means zero.)

This seems to be a BOINC server bug as other projects that go down to zero tasks in the queue show it as well.

There is stuff happening in the testing branch (where crunchers don't even get credit for their work) and I hope some of it will make it across to the main site next week but I don't have an inside track on how likely that is.
ID: 57626

Alan K
Posts: 484
Credit: 29,559,333
RAC: 6,497
Message 57631 - Posted: 13 Jan 2018, 23:16:44 UTC - in response to Message 57611.  

Still waiting to finish upload though all 18 trickles are present. Server has an .mx location and just gives an Apache test page when looked at in a browser.
ID: 57631

Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1079
Credit: 6,837,697
RAC: 6,537
Message 57633 - Posted: 14 Jan 2018, 0:17:15 UTC - in response to Message 57631.  

Still waiting to finish upload though all 18 trickles are present. Server has an .mx location and just gives an Apache test page when looked at in a browser.

Thanks, Alan. The request for the server to be fixed has been made again, but no news yet. The '.mx' makes sense as CAM is Central America. There's clearly a significant problem server-side.
ID: 57633

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4309
Credit: 16,355,267
RAC: 6,278
Message 57634 - Posted: 14 Jan 2018, 8:42:24 UTC - in response to Message 57633.  

The batch going to a Korean server was redirected to Oxford. This could be done with these as well, though I don't know how much leeway there is for this if any more servers away from the UK stop working!
ID: 57634

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4309
Credit: 16,355,267
RAC: 6,278
Message 57655 - Posted: 16 Jan 2018, 16:14:10 UTC

Still waiting to finish upload though all 18 trickles are present. Server has an .mx location and just gives an Apache test page when looked at in a browser.


Is this the link you are using? http://upload6.cpdn.org/cgi-bin/file_upload_handler - Andy says that is the address to look at.

If so, it is now giving,

<data_server_reply>
<status>1</status>
<message>no command</message>
</data_server_reply>


If it was the address you looked at, things may now be working.
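For anyone who wants to check this without eyeballing the page, the two cases can be told apart by whether the returned text is a data_server_reply at all. A minimal Python sketch, assuming the reply text has already been fetched by some other means; the function name and the wording of the results are made up for illustration, and reading "no command" as a sign that the handler is alive is my interpretation rather than anything confirmed by the project:

```python
import xml.etree.ElementTree as ET

NOT_HANDLER = "not a handler reply (for example an HTML test page)"

def describe_reply(body):
    """Classify the text returned by the upload handler address."""
    try:
        root = ET.fromstring(body.strip())
    except ET.ParseError:
        # Empty text or HTML that is not well-formed XML ends up here.
        return NOT_HANDLER
    if root.tag != "data_server_reply":
        return NOT_HANDLER
    status = root.findtext("status", default="?").strip()
    message = (root.findtext("message") or "").strip()
    return f"handler replied: status {status}, message '{message}'"

body = """
<data_server_reply>
<status>1</status>
<message>no command</message>
</data_server_reply>
"""
print(describe_reply(body))
```

A browser visit sends no upload request, so an error reply of this kind is presumably the most the handler can say; it does not by itself show that large uploads will complete.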
ID: 57655

Alan K
Joined: 22 Feb 06
Posts: 484
Credit: 29,559,333
RAC: 6,497
Message 57656 - Posted: 16 Jan 2018, 22:57:56 UTC - in response to Message 57655.  

Log file extract:

16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Info: Connected to upload6.cpdn.org (158.97.9.11) port 80 (#5)
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server: POST /cgi-bin/file_upload_handler HTTP/1.1
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server: Host: upload6.cpdn.org
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.8.3)
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server: Accept: */*
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server: Accept-Encoding: deflate, gzip
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server: Content-Type: application/x-www-form-urlencoded
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server: Accept-Language: en_GB
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server: Content-Length: 68301092
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server: Expect: 100-continue
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Sent header to server:
16/01/2018 20:36:28 | climateprediction.net | [http] [ID#5] Received header from server: HTTP/1.1 100 Continue
16/01/2018 20:36:51 | climateprediction.net | [http] [ID#5] Info: Recv failure: Connection was reset
16/01/2018 20:36:51 | climateprediction.net | [http] [ID#5] Info: Closing connection 5
16/01/2018 20:36:51 | climateprediction.net | [http] HTTP error: Failure when receiving data from the peer
16/01/2018 20:36:52 | climateprediction.net | [file_xfer] http op done; retval -184 (transient HTTP error)
16/01/2018 20:36:52 | climateprediction.net | [file_xfer] file transfer status -184 (transient HTTP error)
16/01/2018 20:36:52 | climateprediction.net | Temporarily failed upload of wah2_cam25_a0ds_200405_18_691_011370045_1_r1231016019_17.zip: transient HTTP error
So I think Andy is correct. Maybe there is a large backlog.
ID: 57656