Batch 996 Weather@Home2 East Asia25

zombie67 [MM]
Joined: 2 Oct 06
Posts: 52
Credit: 26,209,214
RAC: 3,355
Message 69779 - Posted: 12 Oct 2023, 15:11:59 UTC

Of the 54 tasks I have in progress, 3 have now completed. But they are stuck uploading, and cannot report. Is this what is going to happen to all 54 tasks? Am I wasting my time and energy crunching these tasks? Or is there a solution in the works, about to be rolled out?
ID: 69779
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4348
Credit: 16,541,921
RAC: 6,087
Message 69782 - Posted: 12 Oct 2023, 15:30:56 UTC - in response to Message 69779.  

Of the 54 tasks I have in progress, 3 have now completed. But they are stuck uploading, and cannot report. Is this what is going to happen to all 54 tasks? Am I wasting my time and energy crunching these tasks? Or is there a solution in the works, about to be rolled out?


I don't know where we are on that at the moment. I can see on the server status page an increase in users reporting tasks, so obviously some are getting through (8 tasks had been successfully completed and reported when the page updated around 15 hours ago). I know Richard and Andy have been looking at it, and I am sure that as soon as they know anything for certain they will let us know. I only have 9 tasks running, and all uploads have gone through with only a handful of retries necessary, which means my machine is useless for trying to work out what the problem is.
ID: 69782
ChelseaOilman
Joined: 24 Dec 19
Posts: 28
Credit: 27,767,688
RAC: 220,325
Message 69785 - Posted: 12 Oct 2023, 16:39:13 UTC - in response to Message 69776.  

What are these restart zip files I'm seeing?
They are files generated by a lot of CPDN tasks that I think can be used to generate further tasks. They don't, however, always get used. More often than not they are generated at the end of a task rather than halfway through, as in this and the previous batch.

I have 19 zip files that won't upload. 3 of those are restart zips. This is on just 1 of my computers. Are you using the Krembil server? Seems just about as reliable. Clearly whoever is providing the server services either doesn't know what they're doing or doesn't have the capacity to handle these large files.
ID: 69785
rob
Joined: 5 Jun 09
Posts: 79
Credit: 3,038,562
RAC: 4,077
Message 69786 - Posted: 12 Oct 2023, 16:41:15 UTC
Last modified: 12 Oct 2023, 17:19:47 UTC

Until about two hours ago all my transfers were going through on their first try. Then I had three files of ~94 MB to transfer and everything stopped. I was able to move one out of the queue, but the other two are now in an eternal retry cycle. The event log shows that most attempts write a number of 16 KB (16384-byte) blocks before failing:
12/10/2023 15:31:43 | climateprediction.net | Temporarily failed upload of wah2_eas25_a2kg_200112_24_996_012226876_1_r917371817_8.zip: transient HTTP error
12/10/2023 15:31:43 | climateprediction.net | Backing off 00:02:31 on upload of wah2_eas25_a2kg_200112_24_996_012226876_1_r917371817_8.zip
12/10/2023 15:31:43 |  | [http_xfer] [ID#0] HTTP: wrote 16384 bytes
12/10/2023 15:31:43 |  | [http_xfer] [ID#0] HTTP: wrote 16384 bytes
12/10/2023 15:31:43 |  | [http_xfer] [ID#0] HTTP: wrote 16384 bytes
12/10/2023 15:31:43 |  | [http_xfer] [ID#0] HTTP: wrote 16384 bytes
12/10/2023 15:31:43 |  | [http_xfer] [ID#0] HTTP: wrote 16384 bytes
12/10/2023 15:31:43 |  | [http_xfer] [ID#0] HTTP: wrote 10694 bytes
12/10/2023 15:31:44 |  | Internet access OK - project servers may be temporarily down.


Great "fun", but not much help. I would guess that this is a server-side problem since it affects a fair number of users from various parts of the world.
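
For anyone wanting to tally failures rather than read the event log by eye, here is a minimal sketch that counts failed uploads and reads the back-off times. It assumes the log has been copied out of the event log as plain text in the same layout as the excerpt above; the `summarise` function and the regular expressions are illustrative only and not part of BOINC.

```python
import re
from collections import Counter

# A few lines copied from the event log excerpt above.
LOG = """\
12/10/2023 15:31:43 | climateprediction.net | Temporarily failed upload of wah2_eas25_a2kg_200112_24_996_012226876_1_r917371817_8.zip: transient HTTP error
12/10/2023 15:31:43 | climateprediction.net | Backing off 00:02:31 on upload of wah2_eas25_a2kg_200112_24_996_012226876_1_r917371817_8.zip
12/10/2023 15:31:43 |  | [http_xfer] [ID#0] HTTP: wrote 16384 bytes
12/10/2023 15:31:43 |  | [http_xfer] [ID#0] HTTP: wrote 10694 bytes
"""

# Patterns are assumptions based only on the message wording seen in this thread.
FAILED = re.compile(r"Temporarily failed upload of (\S+): (.+)$")
BACKOFF = re.compile(r"Backing off (\d+):(\d+):(\d+) on upload of (\S+)$")
WROTE = re.compile(r"HTTP: wrote (\d+) bytes$")


def summarise(log_text):
    """Return (failures per file, last back-off in seconds per file, total bytes written)."""
    failures = Counter()
    backoff_seconds = {}
    bytes_written = 0
    for line in log_text.splitlines():
        if (m := FAILED.search(line)):
            failures[m.group(1)] += 1
        elif (m := BACKOFF.search(line)):
            hours, minutes, seconds = (int(m.group(i)) for i in (1, 2, 3))
            backoff_seconds[m.group(4)] = hours * 3600 + minutes * 60 + seconds
        elif (m := WROTE.search(line)):
            bytes_written += int(m.group(1))
    return failures, backoff_seconds, bytes_written


failures, backoff_seconds, bytes_written = summarise(LOG)
for name, count in failures.items():
    print(name, "failed", count, "time(s); next retry in", backoff_seconds.get(name), "s")
print("bytes written in this excerpt:", bytes_written)
```

Run over a longer log, this should make it easier to see whether the same few files keep failing or whether every upload is affected.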
ID: 69786
rob
Joined: 5 Jun 09
Posts: 79
Credit: 3,038,562
RAC: 4,077
Message 69787 - Posted: 12 Oct 2023, 16:49:32 UTC - in response to Message 69786.  

Ah, now a new message:
12/10/2023 17:44:57 | climateprediction.net | Started upload of wah2_eas25_a21e_199812_24_996_012226190_0_r1771898375_21.zip
12/10/2023 17:45:04 |  | [http_xfer] [ID#125] HTTP: wrote 99 bytes
12/10/2023 17:45:05 |  | [http_xfer] [ID#126] HTTP: wrote 192 bytes
12/10/2023 17:45:05 | climateprediction.net | [error] Error reported by file upload server: [wah2_eas25_a21e_199812_24_996_012226190_0_r1771898375_21.zip] locked by file_upload_handler PID=4075002
12/10/2023 17:45:05 | climateprediction.net | Temporarily failed upload of wah2_eas25_a21e_199812_24_996_012226190_0_r1771898375_21.zip: transient upload error
12/10/2023 17:45:05 | climateprediction.net | Backing off 04:07:02 on upload of wah2_eas25_a21e_199812_24_996_012226190_0_r1771898375_21.zip


Not seen this bit before:
12/10/2023 17:45:05 | climateprediction.net | [error] Error reported by file upload server: [wah2_eas25_a21e_199812_24_996_012226190_0_r1771898375_21.zip] locked by file_upload_handler PID=4075002
12/10/2023 17:45:05 | climateprediction.net | Temporarily failed upload of wah2_eas25_a21e_199812_24_996_012226190_0_r1771898375_21.zip: transient upload error

ID: 69787
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4348
Credit: 16,541,921
RAC: 6,087
Message 69788 - Posted: 12 Oct 2023, 17:02:25 UTC

I would guess that this is a server-side problem since it affects a fair number of users from various parts of the world.


Yes, it is the Korean server. The science results for CPDN tasks go to servers across the world where scientists have commissioned Oxford to distribute the work. East Asia tasks tend to go to Korea, ANZ tasks to Hobart, etc. My next uploads (a 12.zip and a restart) should come in about ninety minutes. If I don't get sidetracked by non-BOINC activity I shall put http debug on just before they are due to go. But knowing my luck they will go through fine, as the rest of mine have so far, so I won't learn anything.
ID: 69788
ChelseaOilman
Joined: 24 Dec 19
Posts: 28
Credit: 27,767,688
RAC: 220,325
Message 69789 - Posted: 12 Oct 2023, 18:20:25 UTC - in response to Message 69788.  

I shall put http debug on just before they are due to go.

Is that something I can do here? If so, how?
ID: 69789
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4348
Credit: 16,541,921
RAC: 6,087
Message 69790 - Posted: 12 Oct 2023, 19:49:17 UTC - in response to Message 69789.  
Last modified: 12 Oct 2023, 20:02:06 UTC

Is that something I can do here? If so, how?
From the Manager: Options > Event Log Options, then tick http_debug.

Do untick it afterwards, though, as it produces a lot of output that you don't want filling up your event log when you are not trying to diagnose HTTP problems.

Having enabled it my 12_zip has gone through and the restart is at 95% and counting (slowly on my bored band.) Edit: restart file also gone through with no problems.

This may sound completely mad but could it be that the problem is more likely to show up for those with fast connections? I can't think why that should be the case but just throwing ideas out.
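
For machines without the Manager to hand, the same switch can be flipped in cc_config.xml. The sketch below is a hedged illustration: it assumes the flag lives under `<log_flags>` in cc_config.xml in the BOINC data directory, and `set_log_flag` is a hypothetical helper, not a BOINC tool. It works on the XML text only, so nothing on disk is touched until you choose to write the result back yourself (after making a backup).

```python
import xml.etree.ElementTree as ET


def set_log_flag(cc_config_xml, flag, enabled):
    """Return cc_config.xml text with <log_flags><flag> set to 1 or 0.

    Hypothetical helper: creates <log_flags> and the flag element if missing.
    """
    root = ET.fromstring(cc_config_xml)
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")
    node = log_flags.find(flag)
    if node is None:
        node = ET.SubElement(log_flags, flag)
    node.text = "1" if enabled else "0"
    return ET.tostring(root, encoding="unicode")


# Turn http_debug on before retrying an upload ...
debug_on = set_log_flag("<cc_config></cc_config>", "http_debug", True)
print(debug_on)

# ... and off again afterwards so the event log does not fill up.
debug_off = set_log_flag(debug_on, "http_debug", False)
print(debug_off)
```

The client only picks up a changed cc_config.xml once it re-reads its config files (the "Re-reading cc_config.xml" line that appears in the event log) or is restarted.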
ID: 69790
ChelseaOilman
Joined: 24 Dec 19
Posts: 28
Credit: 27,767,688
RAC: 220,325
Message 69791 - Posted: 12 Oct 2023, 20:40:50 UTC
Last modified: 12 Oct 2023, 20:42:45 UTC

This is from my computer that has 19 zip files that won't upload. I normally have max_file_xfers set to 8 in cc_config.xml, but I set it to 1 a couple of days ago hoping it would help. I have Starlink internet.

10/12/2023 2:23:33 PM |  | Account manager contact succeeded
10/12/2023 2:25:10 PM |  | Re-reading cc_config.xml
10/12/2023 2:25:10 PM |  | log flags: file_xfer, sched_ops, task, http_xfer_debug
10/12/2023 2:25:17 PM |  | Re-reading cc_config.xml
10/12/2023 2:25:17 PM |  | log flags: file_xfer, sched_ops, task, http_xfer_debug
10/12/2023 2:25:53 PM |  | [http_xfer] [ID#1] HTTP: wrote 2256 bytes
10/12/2023 2:26:14 PM | climateprediction.net | Started upload of wah2_eas25_a04e_198512_24_996_012223706_0_r1503549291_1.zip
10/12/2023 2:26:14 PM |  | [http_xfer] [ID#58183] HTTP: wrote 100 bytes
10/12/2023 2:26:41 PM |  | Project communication failed: attempting access to reference site
10/12/2023 2:26:41 PM | climateprediction.net | Temporarily failed upload of wah2_eas25_a04e_198512_24_996_012223706_0_r1503549291_1.zip: transient HTTP error
10/12/2023 2:26:41 PM | climateprediction.net | Backing off 04:00:28 on upload of wah2_eas25_a04e_198512_24_996_012223706_0_r1503549291_1.zip
10/12/2023 2:26:41 PM | climateprediction.net | Started upload of wah2_eas25_a3zq_201012_24_996_012228722_1_r612358962_1.zip
10/12/2023 2:26:42 PM |  | [http_xfer] [ID#0] HTTP: wrote 12594 bytes
10/12/2023 2:26:42 PM |  | [http_xfer] [ID#0] HTTP: wrote 8 bytes
10/12/2023 2:26:42 PM |  | [http_xfer] [ID#0] HTTP: wrote 4 bytes
10/12/2023 2:26:42 PM |  | [http_xfer] [ID#0] HTTP: wrote 7 bytes
10/12/2023 2:26:42 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:26:42 PM |  | [http_xfer] [ID#0] HTTP: wrote 242 bytes
10/12/2023 2:26:42 PM |  | [http_xfer] [ID#0] HTTP: wrote 6 bytes
10/12/2023 2:26:42 PM |  | [http_xfer] [ID#0] HTTP: wrote 5953 bytes
10/12/2023 2:26:42 PM |  | Internet access OK - project servers may be temporarily down.
10/12/2023 2:27:03 PM | climateprediction.net | Temporarily failed upload of wah2_eas25_a3zq_201012_24_996_012228722_1_r612358962_1.zip: transient HTTP error
10/12/2023 2:27:03 PM | climateprediction.net | Backing off 05:09:52 on upload of wah2_eas25_a3zq_201012_24_996_012228722_1_r612358962_1.zip
10/12/2023 2:27:03 PM | climateprediction.net | Started upload of wah2_eas25_a0b7_198712_24_996_012223951_1_r1086088478_2.zip
10/12/2023 2:27:04 PM |  | Project communication failed: attempting access to reference site
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 9786 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 951 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 21 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 57 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 233 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 2 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 2 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 5418 bytes
10/12/2023 2:27:05 PM |  | [http_xfer] [ID#0] HTTP: wrote 2423 bytes
10/12/2023 2:27:05 PM |  | Internet access OK - project servers may be temporarily down.
10/12/2023 2:27:20 PM |  | [http_xfer] [ID#58185] HTTP: wrote 100 bytes
10/12/2023 2:27:23 PM |  | [http_xfer] [ID#58187] HTTP: wrote 64 bytes
10/12/2023 2:27:59 PM |  | Project communication failed: attempting access to reference site
10/12/2023 2:27:59 PM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0b7_198712_24_996_012223951_1_r1086088478_2.zip: transient HTTP error
10/12/2023 2:27:59 PM | climateprediction.net | Backing off 05:02:51 on upload of wah2_eas25_a0b7_198712_24_996_012223951_1_r1086088478_2.zip
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 11600 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 185 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 6 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 1 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 4 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 2597 bytes
10/12/2023 2:27:59 PM |  | [http_xfer] [ID#0] HTTP: wrote 4388 bytes
10/12/2023 2:28:00 PM |  | Internet access OK - project servers may be temporarily down.
10/12/2023 2:28:05 PM |  | [http_xfer] [ID#1] HTTP: wrote 2256 bytes
10/12/2023 2:28:12 PM |  | [http_xfer] [ID#1] HTTP: wrote 7186 bytes
10/12/2023 2:28:15 PM |  | [http_xfer] [ID#58189] HTTP: wrote 1022 bytes
ID: 69791
Richard Haselgrove
Joined: 1 Jan 07
Posts: 943
Credit: 34,330,545
RAC: 11,301
Message 69792 - Posted: 12 Oct 2023, 20:47:26 UTC - in response to Message 69791.  

You've chosen "http_xfer_debug". In this sort of case, "http_debug" (in the other column) is more use.
ID: 69792
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4348
Credit: 16,541,921
RAC: 6,087
Message 69793 - Posted: 12 Oct 2023, 20:48:30 UTC
Last modified: 12 Oct 2023, 20:49:31 UTC

Next time you hit the retry button for a file transfer, try http_debug rather than http_xfer_debug. It's the one that contains the information needed by Glen, Andy, Richard and others who have more of a handle than I do on what the messages mean.
Edit: beaten to it.
ID: 69793
ChelseaOilman
Joined: 24 Dec 19
Posts: 28
Credit: 27,767,688
RAC: 220,325
Message 69794 - Posted: 12 Oct 2023, 21:45:39 UTC

10/12/2023 3:30:43 PM |  | [http] [ID#0] Sent header to server:         <is_u
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: HTTP/1.1 200 OK
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Date: Thu, 12 Oct 2023 21:30:44 GMT
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Expires: -1
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Cache-Control: private, max-age=0
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Content-Type: text/html; charset=ISO-8859-1
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Content-Security-Policy-Report-Only: object-src 'none';base-uri 'self';script-src 'nonce-mVXWhcxa138DAd8BMiarhA' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Content-Encoding: gzip
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Server: gws
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: X-XSS-Protection: 0
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: X-Frame-Options: SAMEORIGIN
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Set-Cookie: 1P_JAR=2023-10-12-21; expires=Sat, 11-Nov-2023 21:30:44 GMT; path=/; domain=.google.com; Secure
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Set-Cookie: AEC=Ackid1TqNpNlxVlCUAxmj3ycHxnWU_gcnx8Dl0gM1uzxiyUaSnjGtjl0nCA; expires=Tue, 09-Apr-2024 21:30:44 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=lax
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Set-Cookie: NID=511=rIJVDud9zw1mmc91P8-7UtMb59i1aEngtLMfKFeu6K6pV9NCetmMEPi_-Ew7smU6f3sKpQ-NWFGWdM_eNMeATxkIiI9pYpB1KuaIJLPW5gGEJzmuEntomrmyJl7L86C5N4zOyws38t23ZCKYokM8U6ZbqNLF_s30NCkWtIqnbfs; expires=Fri, 12-Apr-2024 21:30:44 GMT; path=/; domain=.google.com; HttpOnly
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: Transfer-Encoding: chunked
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server:
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: 00000001
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: 
10/12/2023 3:30:44 PM |  | [http] [ID#0] Received header from server: 00012fd
10/12/2023 3:30:44 PM |  | 
10/12/2023 3:30:44 PM |  | [http] [ID#0] Info:  Connection #39649 to host www.google.com left intact
10/12/2023 3:30:44 PM |  | Internet access OK - project servers may be temporarily down.
10/12/2023 3:31:04 PM | climateprediction.net | [http] [ID#58336] Info:  connect to 141.223.16.156 port 80 failed: Timed out
10/12/2023 3:31:04 PM | climateprediction.net | [http] [ID#58336] Info:  Failed to connect to upload7.cpdn.org port 80 after 21685 ms: Couldn't connect to server
10/12/2023 3:31:04 PM | climateprediction.net | [http] [ID#58336] Info:  Closing connection
10/12/2023 3:31:04 PM | climateprediction.net | [http] HTTP error: Timeout was reached
10/12/2023 3:31:04 PM | climateprediction.net | Temporarily failed upload of wah2_eas25_a1m2_199512_24_996_012225638_1_r224031392_10.zip: transient HTTP error
10/12/2023 3:31:04 PM | climateprediction.net | Backing off 04:55:47 on upload of wah2_eas25_a1m2_199512_24_996_012225638_1_r224031392_10.zip
10/12/2023 3:31:05 PM |  | Project communication failed: attempting access to reference site
10/12/2023 3:31:05 PM |  | [http] HTTP_OP::init_get(): https://www.google.com/
10/12/2023 3:31:05 PM |  | [http] [ID#0] Info:  processing: https://www.google.com/
10/12/2023 3:31:05 PM |  | [http] [ID#0] Info:  Found bundle for host: 0x21852d434c0 [serially]
10/12/2023 3:31:05 PM |  | [http] [ID#0] Info:  Re-using existing connection with host www.google.com
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: GET / HTTP/1.1
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: Host: www.google.com
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.24.1)
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: Accept: */*
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: Accept-Encoding: deflate, gzip
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: Accept-Language: en_US
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: roject_name>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <name>wah2_eas25_a04e_198512_24_996_012223706_0_r1503549291_1.zip</name>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <nbytes>98791421.000000</nbytes>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <max_nbytes>150000000.000000</max_nbytes>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <status>1</status>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <persistent_file_xfer>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <num_retries>56</num_retries>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <first_request_time>1696597000.236715</first_request_time>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <next_request_time>1697165043.758507</next_request_time>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <time_so_far>1828.456284</time_so_far>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <last_bytes_xferred>40894464.000000</last_bytes_xferred>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <is_upload>1</is_upload>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     </persistent_file_xfer>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <project_backoff>514.642508</project_backoff>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: </file_transfer>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: <file_transfer>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <project_url>https://climateprediction.net/</project_url>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <project_name>climateprediction.net</project_name>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <name>wah2_eas25_a1k4_199512_24_996_012225568_1_r601938269_13.zip</name>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <nbytes>98825282.000000</nbytes>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <max_nbytes>150000000.000000</max_nbytes>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <status>1</status>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <persistent_file_xfer>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <num_retries>2</num_retries>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <first_request_time>1697114921.544970</first_request_time>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <next_request_time>0.000000</next_request_time>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <time_so_far>403.921627</time_so_far>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <last_bytes_xferred>0.000000</last_bytes_xferred>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <is_upload>1</is_upload>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     </persistent_file_xfer>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <project_backoff>514.642508</project_backoff>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: </file_transfer>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server: <file_transfer>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <project_url>https://climateprediction.net/</project_url>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <project_name>climateprediction.net</project_name>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <name>wah2_eas25_a3ru_200912_24_996_012228438_1_r2058761054_13.zip</name>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <nbytes>99009152.000000</nbytes>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <max_nbytes>150000000.000000</max_nbytes>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <status>1</status>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:     <persistent_file_xfer>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <num_retries>3</num_retries>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <first_request_time>1697113002.941149</first_request_time>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <next_request_time>0.000000</next_request_time>
10/12/2023 3:31:05 PM |  | [http] [ID#0] Sent header to server:         <time_so
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: HTTP/1.1 200 OK
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Date: Thu, 12 Oct 2023 21:31:05 GMT
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Expires: -1
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Cache-Control: private, max-age=0
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Content-Type: text/html; charset=ISO-8859-1
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Content-Security-Policy-Report-Only: object-src 'none';base-uri 'self';script-src 'nonce-EKCiHkxcjqQs1TZsEgvE-Q' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Content-Encoding: gzip
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Server: gws
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: X-XSS-Protection: 0
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: X-Frame-Options: SAMEORIGIN
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Set-Cookie: 1P_JAR=2023-10-12-21; expires=Sat, 11-Nov-2023 21:31:05 GMT; path=/; domain=.google.com; Secure
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Set-Cookie: AEC=Ackid1R0hcl7yq_l_x9GUrotXEsa7q9VzzEkbzK385BlHd59ZZ1F8NFXiTc; expires=Tue, 09-Apr-2024 21:31:05 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=lax
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Set-Cookie: NID=511=ml1gEIyxQdM_oZyFhbSeb1iey7YmsnL3bKZ4PI1U954JslnN8wc9DfSeRxCFQmYFl71voKiKx6I3O8eLQDqjEy9d88xeYgXIzJtSqqXemSFmaDujevHBdYkUbNONO2dHAphiiMj9ZU9QUQqaHfUGdeE1jsdu_QQTRE2jeddoZNI; expires=Fri, 12-Apr-2024 21:31:05 GMT; path=/; domain=.google.com; HttpOnly
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: Transfer-Encoding: chunked
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server:
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: 00000001
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: 
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: 00000001
10/12/2023 3:31:05 PM |  | 
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: 00000001
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: 
10/12/2023 3:31:05 PM |  | [http] [ID#0] Received header from server: 00000001
10/12/2023 3:31:05 PM |  | [http] [ID#0] Info:  Connection #39649 to host www.google.com left intact
10/12/2023 3:31:06 PM |  | Internet access OK - project servers may be temporarily down.
ID: 69794
ChelseaOilman
Joined: 24 Dec 19
Posts: 28
Credit: 27,767,688
RAC: 220,325
Message 69795 - Posted: 13 Oct 2023, 5:22:35 UTC

This computer is now up to 27 tasks waiting to upload. A few got to 90%+ before crapping out. This seems to be an issue more for the computers with a lot of tasks running. Computers with only a few tasks running don't seem to have as much of a problem uploading.
ID: 69795
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4348
Credit: 16,541,921
RAC: 6,087
Message 69797 - Posted: 13 Oct 2023, 6:45:46 UTC

Computers with only a few tasks running don't seem to have as much of a problem uploading.

It would be interesting to hear from a few others on that. It doesn't really make much sense to me if max_file_xfers isn't set any higher than the default. I normally restrict mine to 7 tasks at a time to leave one real core free for non-BOINC activity. Using virtual cores on CPDN I don't get any increase in throughput. I do currently have nine running, as I have put Windows into a VM because Glen doesn't trust WINE under BOINC for running these tasks. So I now have 7 under WINE and two in the VM. The maximum number of concurrent uploads is therefore 4, with both instances of BOINC set to a maximum of 2.

My boredband maxes out at just over 100Kb/second upload speed on a good day with a following wind. When I have very large uploads, I sometimes use the 4G on my phone to get between five and ten times that. As an experiment, could you try limiting upload speed to see if that gets any more through?
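
To put rough numbers on that, here is a back-of-envelope sketch of how long one of these zips takes at different upload speeds. It assumes "100Kb/second" means about 100,000 bytes per second and uses the `<nbytes>` value of 98791421 from the http_debug log earlier in the thread; the `upload_seconds` function is purely illustrative and ignores retries and protocol overhead.

```python
def upload_seconds(nbytes, rate_bytes_per_s):
    """Ideal transfer time in seconds for a file of nbytes at a steady rate."""
    return nbytes / rate_bytes_per_s


ZIP_BYTES = 98_791_421   # <nbytes> of one result zip, from the log in this thread
SLOW_RATE = 100_000      # assumed: ~100 kilobytes per second on the slow line

for label, rate in [("slow line", SLOW_RATE),
                    ("phone, 5x", 5 * SLOW_RATE),
                    ("phone, 10x", 10 * SLOW_RATE)]:
    minutes = upload_seconds(ZIP_BYTES, rate) / 60
    print(f"{label}: about {minutes:.1f} minutes per zip")
```

On those assumptions a single zip needs roughly a quarter of an hour on the slow line, so a connection has to stay up for a long time per file, whereas a fast line pushes the same data in a much shorter burst.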
ID: 69797
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4348
Credit: 16,541,921
RAC: 6,087
Message 69799 - Posted: 13 Oct 2023, 7:48:33 UTC

So far 19 tasks completed successfully and reported. (up from 8 this time yesterday.) If there were not a major problem with uploading I would have expected more as progressively slower machines start completing tasks.
ID: 69799
Tomcat
Joined: 29 May 15
Posts: 17
Credit: 378,867
RAC: 3,044
Message 69800 - Posted: 13 Oct 2023, 8:27:00 UTC
Last modified: 13 Oct 2023, 8:33:05 UTC

I've just had a batch 996 task error out due to "196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED".

What could cause this to happen? There is no way the disk is getting full since there is at least 800GB available to BOINC. I don't think the computer ever restarted (caught and rescheduled a Windows update).
ID: 69800
rob
Joined: 5 Jun 09
Posts: 79
Credit: 3,038,562
RAC: 4,077
Message 69801 - Posted: 13 Oct 2023, 8:34:20 UTC - in response to Message 69797.  

Well....
I'm running two or three tasks, and for the last day or so have been living with at least one upload stuck in the loop.
ID: 69801
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4348
Credit: 16,541,921
RAC: 6,087
Message 69802 - Posted: 13 Oct 2023, 8:45:08 UTC - in response to Message 69800.  
Last modified: 13 Oct 2023, 9:04:59 UTC

I've just had a batch 996 task error out due to "196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED".
Did you have a lot of zips queued? There is an entry that the task sets in one of the XML files that limits how much space the task can take up on the disk. I have manually edited this in the past for some tasks because of my very slow connection. I will edit this post once I have reminded myself of the entry and which file it is.

Edit: The file is client_state.xml and the entry is
<rsc_disk_bound>2000000000.000000</rsc_disk_bound> You will have an entry like that under the section for each task that is running. If the total of working files plus zips waiting to go for the relevant task exceeds that, the task errors out. I have increased this in the past using a text editor. If you do try this, the usual caveats apply about it being at your own risk and making backups (do as I say, not as I do!). I think it was on OIFS tasks that I last did it, and I increased the size by a factor of 10. The danger is that you need to close down BOINC to do this and may lose the tasks anyway as a result of doing so.

Edit2: Looking at the task in question I saw this on the task page.
Peak disk usage 1,998.41 MB

That is only just below the limit. One of the tasks that completed successfully had,
Peak disk usage 199.63 MB
If it was not zips waiting to upload, then something else has gone wrong with the task causing it to use so much disk space.
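
Before editing anything, the limits can be checked read-only. The sketch below assumes client_state.xml holds one `<workunit>` element per task with `<name>` and `<rsc_disk_bound>` children, and that "MB" on the task page means 10^6 bytes; both are assumptions, and the sample XML, the workunit name and the helper functions are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up fragment in the assumed layout; a real client_state.xml is far larger.
SAMPLE = """\
<client_state>
  <workunit>
    <name>example_workunit</name>
    <rsc_disk_bound>2000000000.000000</rsc_disk_bound>
  </workunit>
</client_state>
"""


def disk_bounds(client_state_xml):
    """Map each workunit name to its rsc_disk_bound in bytes."""
    root = ET.fromstring(client_state_xml)
    bounds = {}
    for workunit in root.iter("workunit"):
        name = workunit.findtext("name")
        bound = workunit.findtext("rsc_disk_bound")
        if name is not None and bound is not None:
            bounds[name] = float(bound)
    return bounds


def headroom_mb(bound_bytes, peak_usage_mb):
    """Space left under the bound, in MB, assuming 1 MB = 10**6 bytes."""
    return bound_bytes / 1e6 - peak_usage_mb


for name, bound in disk_bounds(SAMPLE).items():
    print(name, "limit:", bound / 1e6, "MB")

# Peak usage figures quoted above for the failed and the successful task.
print("failed task headroom (MB):", round(headroom_mb(2e9, 1998.41), 2))
print("successful task headroom (MB):", round(headroom_mb(2e9, 199.63), 2))
```

Reading the file this way does not need BOINC to be shut down; only editing it does.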
ID: 69802
Tomcat
Joined: 29 May 15
Posts: 17
Credit: 378,867
RAC: 3,044
Message 69803 - Posted: 13 Oct 2023, 9:19:13 UTC - in response to Message 69802.  
Last modified: 13 Oct 2023, 9:29:49 UTC

Oh dear. I just checked and I think ALL of the trickle zips are still uploading. I'll suspend the CPDN tasks until that gets sorted out. I still got the credits for all the trickles that refuse to upload. One file is stuck at 94.46 percent.

What could cause the uploads to constantly fail? My other projects work fine so it's probably not my Internet.

2023-10-13 5:22:52 AM | climateprediction.net | Started upload of wah2_eas25_a0uu_199012_24_996_012224658_2_r1702623904_7.zip
2023-10-13 5:22:52 AM | climateprediction.net | Started upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_13.zip
2023-10-13 5:23:15 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0uu_199012_24_996_012224658_2_r1702623904_7.zip: transient HTTP error
2023-10-13 5:23:15 AM | climateprediction.net | Backing off 00:10:19 on upload of wah2_eas25_a0uu_199012_24_996_012224658_2_r1702623904_7.zip
2023-10-13 5:23:15 AM | climateprediction.net | Started upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_12.zip
2023-10-13 5:23:15 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_13.zip: transient HTTP error
2023-10-13 5:23:15 AM | climateprediction.net | Backing off 00:06:13 on upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_13.zip
2023-10-13 5:23:15 AM | climateprediction.net | Started upload of wah2_eas25_a2c6_200012_24_996_012226578_1_r43387403_5.zip
2023-10-13 5:23:16 AM |  | Project communication failed: attempting access to reference site
2023-10-13 5:23:17 AM |  | Internet access OK - project servers may be temporarily down.
2023-10-13 5:23:37 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_12.zip: transient HTTP error
2023-10-13 5:23:37 AM | climateprediction.net | Backing off 00:07:03 on upload of wah2_eas25_a0nm_198912_24_996_012224398_0_r372338944_12.zip
2023-10-13 5:23:37 AM | climateprediction.net | Temporarily failed upload of wah2_eas25_a2c6_200012_24_996_012226578_1_r43387403_5.zip: transient HTTP error
2023-10-13 5:23:37 AM | climateprediction.net | Backing off 00:07:59 on upload of wah2_eas25_a2c6_200012_24_996_012226578_1_r43387403_5.zip
2023-10-13 5:23:38 AM |  | Project communication failed: attempting access to reference site
2023-10-13 5:23:39 AM |  | Internet access OK - project servers may be temporarily down.
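With a lot of zips queued, a log like the one above gets long quickly. As a rough sketch, the following Python tallies the failures and the most recent back-off per file. It assumes the event log lines have exactly the wording shown above; the function name `summarize_uploads` is hypothetical and this is not part of BOINC.

```python
import re
from collections import Counter

# Patterns for lines such as:
#   "... | Temporarily failed upload of <file>.zip: transient HTTP error"
#   "... | Backing off 00:10:19 on upload of <file>.zip"
FAILED_RE = re.compile(r"Temporarily failed upload of (\S+?): (.+)$")
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d+):(\d+) on upload of (\S+)")


def summarize_uploads(log_lines):
    """Return (failure count per reason, latest back-off in seconds per file)."""
    failures = Counter()
    backoffs = {}
    for line in log_lines:
        failed = FAILED_RE.search(line)
        if failed:
            failures[failed.group(2).strip()] += 1
            continue
        backoff = BACKOFF_RE.search(line)
        if backoff:
            hours, minutes, seconds = (int(x) for x in backoff.group(1, 2, 3))
            backoffs[backoff.group(4)] = hours * 3600 + minutes * 60 + seconds
    return failures, backoffs


if __name__ == "__main__":
    sample = [
        "2023-10-13 5:23:15 AM | climateprediction.net | Temporarily failed upload of "
        "wah2_eas25_a0uu_199012_24_996_012224658_2_r1702623904_7.zip: transient HTTP error",
        "2023-10-13 5:23:15 AM | climateprediction.net | Backing off 00:10:19 on upload of "
        "wah2_eas25_a0uu_199012_24_996_012224658_2_r1702623904_7.zip",
    ]
    failures, backoffs = summarize_uploads(sample)
    print(dict(failures))
    print(backoffs)
```

If every failure has the same reason, as in the log above, that points at the upload server rather than at any particular file.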
ID: 69803
Glenn Carver

Joined: 29 Oct 17
Posts: 812
Credit: 13,623,404
RAC: 6,372
Message 69804 - Posted: 13 Oct 2023, 10:10:28 UTC - in response to Message 69800.  

I've just had a batch 996 task error out due to "196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED".

What could cause this to happen? There is no way the disk is getting full since there is at least 800GB available to BOINC. I don't think the computer ever restarted (caught and rescheduled a Windows update).
That's useful to know, thanks for reporting it. I'll raise this in the CPDN Technical meeting on Monday. We can raise the limit to allow for transfers still waiting to go. As Dave says, it's a limit set by CPDN on how much disk space the task is expected to use. It's normally set with a decent margin, but I suspect it hasn't been changed for some time and the regions for WAH2 have got bigger.

Regarding transfers, Andy is looking at the Korean server but doesn't have full access, so he is communicating with the Korean IT staff, which will take a while.
---
CPDN Visiting Scientist
ID: 69804