climateprediction.net home page
The uploads are stuck

Message boards : Number crunching : The uploads are stuck

PaoloNasca

Joined: 7 Jul 19
Posts: 1
Credit: 1,100,133
RAC: 515
Message 67040 - Posted: 25 Dec 2022, 8:03:24 UTC

Dear web master and project admin,
The uploads are stuck

Project communication failed: attempting access to reference site
Temporarily failed upload of oifs_.......zip: transient HTTP error
Backing off 00:02:15 on upload of oifs_......zip
Internet access OK - project servers may be temporarily down.

Paolo
wujj123456

Joined: 14 Sep 08
Posts: 98
Credit: 35,000,682
RAC: 107,066
Message 67042 - Posted: 25 Dec 2022, 8:43:56 UTC - in response to Message 67040.  

Same here, and the discussion is already happening in the OpenIFS thread right below, starting here: https://www.cpdn.org/forum_thread.php?id=9162&postid=67029#67029. Honestly I wouldn't mind if this isn't fixed until next year. It's holiday season after all.
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1083
Credit: 16,671,999
RAC: 5,277
Message 67043 - Posted: 25 Dec 2022, 9:57:42 UTC - in response to Message 67042.  

Honestly I wouldn't mind if this isn't fixed until next year. It's holiday season after all.


I very much mind. I have over 500 "trickles" to upload. Each is about 14 Megabytes in size. Since I have 5 Oifs tasks running at a time, they produce about five of these trickles every six minutes. They gotta be stored somewhere. Fortunately for me, I have all my Boinc stuff in a dedicated disk partition that is about 500 GBytes in size, so I could survive until the new year, but by then there will be an awful lot of uploads to send at the same time. I hope their server can take this much. I hope these tasks do not get rejected if they expire before the uploads are sent. And I am not the only one.
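
As a rough check on these figures, here is a minimal Python sketch. The function names are hypothetical, the inputs are the approximate numbers quoted above (14 MB per trickle, five trickles every six minutes, a 500 GB partition), and 1 GB is taken as 1000 MB.

```python
def backlog_growth_mb_per_hour(trickle_mb, trickles_per_batch, batch_minutes):
    """Megabytes of pending uploads added per hour while uploads are blocked."""
    batches_per_hour = 60 / batch_minutes
    return trickle_mb * trickles_per_batch * batches_per_hour


def days_until_full(free_gb, growth_mb_per_hour):
    """Days until free_gb of disk is used up at the given growth rate (1 GB = 1000 MB)."""
    return free_gb * 1000 / growth_mb_per_hour / 24


growth = backlog_growth_mb_per_hour(trickle_mb=14, trickles_per_batch=5, batch_minutes=6)
print(f"Backlog grows by about {growth:.0f} MB per hour")
print(f"500 queued trickles take about {500 * 14 / 1000:.0f} GB")
print(f"A 500 GB partition would last about {days_until_full(500, growth):.0f} days")
```

Under those assumptions the backlog grows by roughly 700 MB per hour, so an empty 500 GB partition would take on the order of a month to fill, which is consistent with "I could survive until the new year".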
wujj123456

Joined: 14 Sep 08
Posts: 98
Credit: 35,000,682
RAC: 107,066
Message 67045 - Posted: 25 Dec 2022, 20:10:17 UTC - in response to Message 67043.  

Honestly I wouldn't mind if this isn't fixed until next year. It's holiday season after all.


I very much mind. I have over 500 "trickles" to upload. Each is about 14 Megabytes in size. Since I have 5 Oifs tasks running at a time, they produce about five of these trickles every six minutes. They gotta be stored somewhere. Fortunately for me, I have all my Boinc stuff in a dedicated disk partition that is about 500 GBytes in size, so I could survive until the new year, but by then there will be an awful lot of uploads to send at the same time. I hope their server can take this much. I hope these tasks do not get rejected if they expire before the uploads are sent. And I am not the only one.

We can just switch to some other projects in the meantime? I doubt any BOINC projects are run by multiple members like a real on-call rotation. People need to take breaks.

The server or its link will probably get overloaded once it starts accepting uploads again. People on limited upload bandwidth had better have good traffic-shaping policies, or the constantly saturated upload can easily render their Internet unusable. The disk usage is a good reminder that this can actually affect other projects. If CPDN keeps sending WUs but never allows any upload, it's possible the disk usage would reach BOINC's max disk usage and thus stop all projects...

PS: The pending uploads are stored in projects/climateprediction.net under the BOINC data directory, if you want to monitor its current size.
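
One way to keep an eye on that directory is a small script like the sketch below. It assumes the BOINC data directory is /var/lib/boinc-client, which is typical for Debian/Ubuntu packages but is not stated in this thread; adjust the path for your own installation. The helper name is hypothetical.

```python
import os
from pathlib import Path


def dir_size_bytes(root):
    """Sum the sizes of all files below root, skipping files that vanish mid-scan."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # file was removed or is unreadable; ignore it
    return total


# Assumption: /var/lib/boinc-client is the BOINC data directory on this host.
project_dir = Path("/var/lib/boinc-client") / "projects" / "climateprediction.net"
print(f"{project_dir}: {dir_size_bytes(project_dir) / 1e9:.2f} GB")
```

If the directory does not exist or is not readable by the current user, the script simply reports 0.00 GB, so check the path and permissions before trusting a zero.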
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1083
Credit: 16,671,999
RAC: 5,277
Message 67046 - Posted: 25 Dec 2022, 20:55:37 UTC - in response to Message 67045.  

We can just switch to some other projects in the meantime? I doubt any BOINC projects are run by multiple members like a real on-call rotation. People need to take breaks.


Sure we can.
I also run WCG which sends out next to no work and was down for 8 months or so lately.
I also run Rosetta that is sending out no work.
The following do send out new work, but ...
I also run Einstein, but it is not very important to me.
I also run MilkyWay, but it is not important to me.
I also run Universe, but it is not important to me.
Scottie Mckinley

Joined: 13 Feb 22
Posts: 1
Credit: 3,309,784
RAC: 2,221
Message 67047 - Posted: 25 Dec 2022, 21:24:54 UTC

So for Xmas, someone turned off a server or something. In the last week, I wiped W11 off my PC. Installed Linux, having no idea how to use it. Learned enough to get Boinc working in Linux and now this. Like you, I have multiple files waiting. What a week it's been.
wujj123456

Joined: 14 Sep 08
Posts: 98
Credit: 35,000,682
RAC: 107,066
Message 67048 - Posted: 26 Dec 2022, 0:58:02 UTC - in response to Message 67046.  

I know and I feel you. The non-math projects have been dwindling over the years. WCG used to cover a whole lot more, but these days it is just two medical projects with ARP occasionally trickling in. The migration off IBM certainly didn't go well. The projects I added in recent years (asteroid, universe, LHC) are all there because at some point all the projects I contributed to ran out of work. Among the long list of math projects, I have yet to find anything I can remotely relate to. In addition, in winter I'd rather run my computers than turn on the heater.

Still though, BOINC and its projects are generally not run as a high-availability service. That requires a level of funding and expertise that is generally not available to researchers, and it's also a very different focus compared to science research. Sure, we contribute compute power at our own cost, but I personally don't consider that enough to justify expecting people to troubleshoot during holidays.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4365
Credit: 16,614,059
RAC: 1,518
Message 67050 - Posted: 26 Dec 2022, 9:52:44 UTC

For me, the main problem is I will have to stop computing long enough for the uploads to clear. Even if I run flat out with all cores (I am just running on two because otherwise my bored band can't keep up) I won't run out of disk space before the end of January. Another possible issue is that tasks may go over disk_bound (I think that is what it is called). I might go into the files and increase that if it looks like it might cause problems. I had thought to check whether trickles were still working, but while they have been working, as evidenced by tasks getting credit, even those with credit are not showing the trickles on task pages. This was working on the testing site.
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1083
Credit: 16,671,999
RAC: 5,277
Message 67052 - Posted: 26 Dec 2022, 12:35:16 UTC - in response to Message 67050.  

For me, the main problem is I will have to stop computing long enough for the uploads to clear. Even if I run flat out with all cores (I am just running on two because otherwise my bored band can't keep up) I won't run out of disk space before the end of January.


I doubt I will ever run out of disk space because I have about 420 GBytes of space in the partition dedicated to BOINC. I have been running 5 Oifs tasks at a time, but I have now stopped running any more of these. Currently, CPDN is using 21 GBytes of disk space, way more than usual. There are 14 completed tasks trying to upload their stuff.

I have a very fast Internet connection (about 75 Megabits/second, but sometimes it goes a little faster) and with the new server I get rates over half that maximum to CPDN. Until it went to 0, of course. It is now Boxing Day here, so I hope someone can get things working today and I will not need to wait until January 2 or 3.
As of right now, CPDN thinks my Internet speeds are these:
Average upload rate 	116.16 KB/sec
Average download rate 	10834.89 KB/sec

The upload speeds are as low as that now because I cannot upload anything. The download rate is a little higher than usual. I wonder what it has been downloading since CPDN is set to get no new tasks.
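
For a sense of how long the backlog would take to clear once uploads resume, here is a minimal Python sketch. It assumes the 21 GB currently pending and roughly half of the 75 Mbit/s link mentioned above, with decimal units (1 GB = 10^9 bytes); the function name is hypothetical, and server-side limits or other users uploading at the same time are ignored.

```python
def upload_hours(backlog_gb, link_mbit_per_s):
    """Hours needed to send backlog_gb gigabytes at link_mbit_per_s megabits per second."""
    bits = backlog_gb * 1e9 * 8
    return bits / (link_mbit_per_s * 1e6) / 3600


# 21 GB pending, at about half of a 75 Mbit/s connection.
print(f"{upload_hours(21, 37.5):.1f} hours")
```

At that rate the queue would drain in a little over an hour, so for a fast connection the real constraint is more likely to be how much the server can accept than the local link.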
ncoded.com

Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67064 - Posted: 28 Dec 2022, 5:38:25 UTC
Last modified: 28 Dec 2022, 6:04:31 UTC

We hit the (BOINC) 100GB limit over 24 hours ago on our 256t server, basically as soon as the uploads stopped working. Prior to this we hit the limit on this server at around 30% load but this was kept in check by the constant uploading.

Right now we can't download any new work because of the limit, either from CPDN or from any other BOINC project. And we can't shut down the server in case tasks fail when restarted after the reboot. Hopefully the transfer issue will get resolved sooner rather than later.

One of the BOINC Developers recently mentioned that this limit will be removed/fixed in the next BOINC Server release, but that obviously each project will have to update. Is CPDN likely to do this shortly after the new Release is available?
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1083
Credit: 16,671,999
RAC: 5,277
Message 67065 - Posted: 28 Dec 2022, 7:02:20 UTC - in response to Message 67064.  

Right now we can't download any new work because of the limit, either from CPDN or from any other BOINC project. And we can't shut down the server in case tasks fail when restarted after the reboot. Hopefully the transfer issue will get resolved sooner rather than later.

One of the BOINC Developers recently mentioned that this limit will be removed/fixed in the next BOINC Server release, but that obviously each project will have to update. Is CPDN likely to do this shortly after the new Release is available?


I am confused. Are you running a BOINC Server? Most of us users run only a Boinc Client that picks up tasks from a Boinc Server, usually at a central location (such as, perhaps, Oxford, England). I hope it is impossible for us, the users, to shut down a server, because we might do it by accident, and we certainly outnumber the system administrators at the server locations, so we could sure make a mess of things.

Right now I have downloads from CPDN disabled since I cannot upload any results and have about 30 GBytes of files to upload. I do not know about any limit to the number of files to upload other than the amount of disk space I have to keep them, and I have room for over 350 GBytes of that, not counting what I am already using. I have my last three tasks running and then I am done until my existing files start uploading.

I CAN download tasks for other Boinc projects, when they are available. The only one I cannot download from is Rosetta, and the reason for that is they do not have any tasks to supply at the moment.
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 12,074,507
RAC: 2,078
Message 67066 - Posted: 28 Dec 2022, 8:34:50 UTC - in response to Message 67064.  
Last modified: 28 Dec 2022, 8:44:54 UTC

ncoded.com,
Something doesn't sound right. If you're processing tasks, you must be using the BOINC client, even if without the BOINC manager. I do not believe there's any kind of disk usage limit set by BOINC or projects. What's probably happening is that your BOINC client is set to a 100GB limit, either by you or as a default that you left as is. That setting is adjustable. If you have BOINC manager, go to Options menu --> Computer Preferences --> Disk and memory and you'll see 3 options for setting disk limits. If you don't have BOINC manager and are just running the client, go to the /etc/boinc-client directory and edit the global_prefs_override.xml file. The 3 options you're looking for are below with explanations. If they're not there, just add one or all of them.
   <disk_max_used_gb>175.000000</disk_max_used_gb>  Sets the limit in GB that BOINC is allowed to use
   <disk_max_used_pct>0.000000</disk_max_used_pct>  Sets the limit as % of total disk space that BOINC can use
   <disk_min_free_gb>0.000000</disk_min_free_gb>  Tells BOINC to use as much disk space as it needs but to make sure to leave at least the set amount of GB free
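
If you would rather script the change than edit the file by hand, the following is a minimal Python sketch using the standard library's XML module. The helper name is hypothetical, and it assumes the file's root element is <global_preferences>. The demo writes to a temporary file rather than the real /etc/boinc-client/global_prefs_override.xml, which usually needs root privileges to modify.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def set_disk_limit_gb(prefs_path, limit_gb):
    """Set <disk_max_used_gb> in an override file, creating the file if it is missing."""
    if os.path.exists(prefs_path):
        tree = ET.parse(prefs_path)
        root = tree.getroot()
    else:
        # Assumption: the override file's root element is <global_preferences>.
        root = ET.Element("global_preferences")
        tree = ET.ElementTree(root)
    node = root.find("disk_max_used_gb")
    if node is None:
        node = ET.SubElement(root, "disk_max_used_gb")
    node.text = f"{limit_gb:.6f}"
    tree.write(prefs_path, encoding="unicode")


# Demo on a throwaway copy; point prefs_path at the real file only after a backup.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "global_prefs_override.xml")
    set_disk_limit_gb(demo_path, 175)
    with open(demo_path) as f:
        print(f.read())
```

Other settings already present in the file are left untouched; only the one element is added or updated. The client still has to be told to re-read the file before the new value applies.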

You can change any of those settings and BOINC will use the most restrictive one. I for example just use the first setting and set a limit in GB of space BOINC can use. If you're using BOINC manager, once you save the changes they'll take effect immediately. If you only have the client, after making the changes in that file, run the boinccmd command for changes to take effect. If you have a simple, default set up the command would be:
boinccmd --read_global_prefs_override
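
To illustrate what "the most restrictive one" means, here is a simplified Python model. It is not the BOINC client's actual code: the function and parameter names are hypothetical, and it assumes that a value of 0 means "not set", as in the example values above.

```python
def effective_disk_limit_gb(total_gb, free_gb, boinc_used_gb,
                            max_used_gb=0.0, max_used_pct=0.0, min_free_gb=0.0):
    """Return the tightest cap, in GB, on BOINC's own disk usage.

    Simplified model: each non-zero setting yields a cap and the smallest wins.
    """
    caps = []
    if max_used_gb > 0:
        caps.append(max_used_gb)
    if max_used_pct > 0:
        caps.append(total_gb * max_used_pct / 100)
    if min_free_gb > 0:
        # BOINC may grow until only min_free_gb remains free on the disk.
        caps.append(boinc_used_gb + free_gb - min_free_gb)
    return min(caps) if caps else float("inf")


# A 1000 GB disk with 600 GB free, of which BOINC already uses 130 GB.
limit = effective_disk_limit_gb(total_gb=1000, free_gb=600, boinc_used_gb=130,
                                max_used_gb=175, max_used_pct=50, min_free_gb=10)
print(f"Effective limit: {limit:.0f} GB")
```

In this example the three caps are 175 GB, 500 GB and 720 GB, so the 175 GB setting is the one that bites. The same logic would explain hitting a 100 GB ceiling on a disk with over 1 TB free if a 100 GB value were configured somewhere.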

You can also change these preferences on the CPDN website by going to the Computing preferences link under the Preferences section on your main account web page. However, the local global_prefs_override.xml file will override any website settings. Also, the website settings will apply to all PCs you have attached to CPDN unless overridden by that file. If you use the website, make sure to Update the project from the manager or by running the appropriate boinccmd command, or just wait until the client contacts CPDN on its own again.

For more info on BOINC preferences and entries of the global_prefs_override.xml file check out https://boinc.berkeley.edu/wiki/Preferences and https://boinc.berkeley.edu/wiki/PreferencesXml.
ncoded.com

Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67067 - Posted: 28 Dec 2022, 8:47:30 UTC
Last modified: 28 Dec 2022, 9:07:51 UTC

We are just running normal BOINC. By server I mean the BOINC Server that the Projects run.

The 100 GB limit in BOINC was confirmed by one of the Developers.

Unless it says in the "Disk Tab" that you are at/near 100GB then I guess you probably won't see this error.

When we get near 100GB usage, we get a BOINC notification saying that we have run out of disk space. Yet when we check in BOINC, we can see it has over 1TB free and available to itself. This was explained to us as a 100GB limit within BOINC.
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 12,074,507
RAC: 2,078
Message 67068 - Posted: 28 Dec 2022, 8:54:50 UTC

I run all of my Linux BOINC computing on a Windows 10 WSL2 Ubuntu setup. The default partition size of WSL2 is 250GB and it's over 160GB full with BOINC and OS files. Hopefully the uploads get going before 50GB more of files get generated; otherwise I'd probably have to stop CPDN computing so as not to run out of space. I believe the partition can be resized, but I've never tried it before and wouldn't want to risk it and lose all of the work if something goes wrong.
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 12,074,507
RAC: 2,078
Message 67069 - Posted: 28 Dec 2022, 9:30:44 UTC - in response to Message 67067.  

ncoded.com,
Something doesn't make sense. I've changed my settings in the manager as described above a couple of times now, it's at 175GB currently. The Disk tab in manager shows over 130GB used by BOINC. I've never got any warnings but that's probably because I changed the settings before the warning threshold was reached.

On the website your computer shows "BOINC Version 7.20.2" which is a valid version for client/manager. I believe BOINC Server version numbering is different, 1.4 being the latest I think. It seems to me like you are using BOINC client and probably with manager. Have you tried playing with the settings as described above? Unless I'm missing something, I don't see a reason that I'd be able to change mine but you'd be stuck at 100GB. I also am not sure of the limit that developer is referring to. Maybe it's something new in version 7.20.2, I'm running 7.18.1 which is what's available as default from Ubuntu 22.04 package manager.
ncoded.com

Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67070 - Posted: 28 Dec 2022, 9:42:11 UTC
Last modified: 28 Dec 2022, 9:49:08 UTC

Every time we have had this issue, going back over years (using many different hosts), we have only seen it on CPDN. So perhaps you need nearly 100GB of CPDN data (or perhaps of a single project) to hit this issue?

To be honest, I was hoping that one of the mods would answer, as they probably know more about this than others, including myself.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4365
Credit: 16,614,059
RAC: 1,518
Message 67071 - Posted: 28 Dec 2022, 9:42:13 UTC
Last modified: 28 Dec 2022, 9:57:25 UTC

Also no problems going over 100GB here with 7.21.0 built from source from GitHub.

Edit: I say, "no problem" but it will keep my bored band occupied for a long time before it catches up.

Edit 2: I see there is a user running Rosetta who has posted on the BOINC forums with what looks like the same issue, again not resolved even though there are users for whom it doesn't appear to be a problem. see here
ncoded.com

Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67075 - Posted: 28 Dec 2022, 10:05:46 UTC
Last modified: 28 Dec 2022, 10:40:18 UTC

Yeah, different people are saying different things; it's difficult to know which is correct. But like I mentioned, I have seen the same issue across different hosts over the years, all on CPDN, so that would lead me to think the issue is within BOINC (rather than the OS etc).

I just updated BOINC and restarted Linux so I'll try and download more CPDN tasks and see if I get the same error notification.
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4365
Credit: 16,614,059
RAC: 1,518
Message 67076 - Posted: 28 Dec 2022, 10:09:20 UTC - in response to Message 67072.  

And it isn't going to be reproduced very often. Most who have slow connections will have set CPDN to <no new tasks>. I actually have a bit to go before CPDN goes over the 100GB limit, but if Andy doesn't sort things out in the next day or two I will get there.
ncoded.com

Joined: 16 Aug 16
Posts: 73
Credit: 52,932,477
RAC: 8,823
Message 67077 - Posted: 28 Dec 2022, 10:23:50 UTC
Last modified: 28 Dec 2022, 10:35:55 UTC

If you do get to 100GB on CPDN could you let me know if you hit any limit or not?

That would be really useful (as you have a later compiled version).

©2024 climateprediction.net