The uploads are stuck
Yeti
Joined: 5 Aug 04
Posts: 171
Credit: 10,266,501
RAC: 29,602
Message 67951 - Posted: 21 Jan 2023, 21:44:05 UTC - in response to Message 67937.  
Last modified: 21 Jan 2023, 21:44:41 UTC

Here we go again:
21 Jan 2023 17:43 UTC Error reported by file upload server: can't write file oifs.....zip: No space left on server
Same here; we seem to upload faster than the internal processes move files to other places.

Occasional uploads go through.


Supporting BOINC, a great concept!
ID: 67951
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4345
Credit: 16,533,637
RAC: 5,933
Message 67952 - Posted: 21 Jan 2023, 21:45:11 UTC

I have messaged Andy.
ID: 67952
SolarSyonyk

Joined: 7 Sep 16
Posts: 257
Credit: 31,932,017
RAC: 38,282
Message 67956 - Posted: 21 Jan 2023, 22:25:55 UTC - in response to Message 67947.  

Curious what your $ per WU is. I've also recently checked EC2, GCP and Azure, and they all have that nice catch of bandwidth cost: around $0.08-0.10 per GB, which works out to roughly $0.15-0.20 per WU. That alone already exceeds the cost per WU of whatever I can get with my own equipment, electricity and home network. Azure covers the first 100 GB; the others' free usage is negligible.


I've got a dual core EPYC VM running with 10GB RAM at $11.63/mo. It's running about 20h per task, with two going at any given time: https://www.cpdn.org/results.php?hostid=1538282

So, ballpark 70 WU/month, or $0.17/WU in compute costs, plus, as you note, bandwidth. Probably $0.25/WU. It's certainly more expensive than I manage at home, but I'm also upload bandwidth limited here, so I can't actually run all my systems. It's just something I like to mess with every now and then.
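(A minimal sketch of that arithmetic, for anyone who wants to plug in their own numbers; the $0.09/GB egress price and the ~1 GB of upload per task are assumptions read off the figures quoted in this thread, not measured values:)

    HOURS_PER_MONTH = 730  # average hours in a month

    def cost_per_wu(monthly_usd, concurrent_tasks, hours_per_task,
                    egress_usd_per_gb=0.09, gb_per_task=1.0):
        # tasks completed per month at the given concurrency
        tasks_per_month = HOURS_PER_MONTH / hours_per_task * concurrent_tasks
        compute = monthly_usd / tasks_per_month
        return compute + egress_usd_per_gb * gb_per_task

    # $11.63/mo VM, two tasks at a time, ~20 h each
    print(round(cost_per_wu(11.63, 2, 20), 2))  # -> 0.25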
ID: 67956
wujj123456

Joined: 14 Sep 08
Posts: 87
Credit: 32,981,759
RAC: 14,695
Message 67957 - Posted: 21 Jan 2023, 23:30:44 UTC - in response to Message 67956.  

I've got a dual core EPYC VM running with 10GB RAM at $11.63/mo. It's running about 20h per task, with two going at any given time: https://www.cpdn.org/results.php?hostid=1538282

So, ballpark 70 WU/month, or $0.17/WU in compute costs, plus, as you note, bandwidth. Probably $0.25/WU. It's certainly more expensive than I manage at home, but I'm also upload bandwidth limited here, so I can't actually run all my systems. It's just something I like to mess with every now and then.

Thanks. That's close to the number I came up with; all three generally fall into a similar $0.4-$0.5/WU range with their cheapest instance types. They are all pretty competitive against each other, but far more expensive than my own setup. I guess this should be expected, given that their machines are loaded with all sorts of things I don't need, plus better network, uptime, etc., and they still need to make money. My upload link isn't great either, but it's enough for now. Hopefully the next versions of OpenIFS will have a higher compute-to-bandwidth ratio, to make this easier.
ID: 67957
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4345
Credit: 16,533,637
RAC: 5,933
Message 67959 - Posted: 22 Jan 2023, 8:29:36 UTC

From Andy:

Hi Dave,

Thanks. I have throttled down the number of incoming connections. This will give a chance for the rsync process to catch up.

Best wishes,

Andy
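(For illustration only: a conceptual sketch of what such throttling amounts to, not CPDN's actual code; the port, the connection limit, and the handler are all invented:)

    import asyncio

    MAX_CONNECTIONS = 8                 # throttled-down limit (assumed)
    slots = asyncio.Semaphore(MAX_CONNECTIONS)

    async def handle_upload(reader, writer):
        # No free slot: refuse immediately; the BOINC client backs off and retries.
        if slots.locked():
            writer.close()
            return
        async with slots:
            await reader.read()         # receive the zip (greatly simplified)
            # ... write it to the spool dir; rsync drains that to backing store ...
        writer.close()

    async def main():
        server = await asyncio.start_server(handle_upload, "0.0.0.0", 8080)
        async with server:
            await server.serve_forever()

    asyncio.run(main())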
ID: 67959
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,276,661
RAC: 11,053
Message 67961 - Posted: 22 Jan 2023, 8:45:20 UTC - in response to Message 67959.  
Last modified: 22 Jan 2023, 9:03:11 UTC

Saw Andy's message, timed at 21:55 last night. It doesn't seem to have made much difference - I'm still in multi-hour project backoffs. I'll check exactly how many uploads are getting through when I've woken up a bit more.

On another subject, the generic Climate Prediction home page https://www.climateprediction.net/index.php is giving me an error today:

Your PHP installation appears to be missing the MySQL extension which is required by WordPress.
PHP and WordPress are server-side technologies, so I think it's their installation, rather than my installation. Probably an update went wrong. The cpdn.org/cpdnboinc/ pages are working fine.

Edit - one machine got a 5-minute burst of uploads around 22:45 last night, and another around 04:45 this morning, but nothing since then. (A total of 100 files across the two bursts)
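(Those multi-hour backoffs are the client's standard randomized exponential retry. A rough sketch of the shape; the one-minute base and the roughly four-hour cap are assumed from the delays visible in these logs, not taken from the BOINC source:)

    import random

    def next_backoff(n_failures, base=60.0, cap=4 * 3600.0):
        # double the delay per failure, capped, with jitter so that
        # thousands of clients don't all retry in lockstep
        delay = min(base * 2 ** n_failures, cap)
        return random.uniform(delay / 2, delay)

    for n in range(1, 9):
        print(n, round(next_backoff(n) / 3600, 2), "h")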
ID: 67961
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4345
Credit: 16,533,637
RAC: 5,933
Message 67964 - Posted: 22 Jan 2023, 9:50:07 UTC - in response to Message 67961.  

I have just seen one zip go through for me in the past hour. And I get the same as you on the WordPress thing.
ID: 67964
xii5ku

Joined: 27 Mar 21
Posts: 79
Credit: 78,302,757
RAC: 1,077
Message 67965 - Posted: 22 Jan 2023, 9:55:02 UTC - in response to Message 67961.  
Last modified: 22 Jan 2023, 10:12:32 UTC

Richard Haselgrove wrote:
Saw Andy's message, timed at 21:55 last night. It doesn't seem to have made much difference - I'm still in multi-hour project backoffs. I'll check exactly how many uploads are getting through when I've woken up a bit more. [...] Edit - one machine got a 5-minute burst of uploads around 22:45 last night, and another around 04:45 this morning, but nothing since then. (A total of 100 files across the two bursts)
The situation has turned from a certain portion of transfers failing (and going into retry loops) to a large portion of connection requests being rejected.

Ever since the upload server was revived, it has evidently been working at or near its throughput limit; only the details of how it copes vary slightly over time. From the project's infrastructure point of view there is one good aspect of this: the upload infrastructure is well utilized (as long as it doesn't go down as it did on Christmas Eve and during the first recovery attempt, or attempts, in early January). For us client operators it is of course bad, because we have to constantly watch and possibly re-adjust our compute clients to keep transfer backlogs from growing too large, or even to avoid outright task failures from lack of disk space. The client can deal with a situation like this somewhat on its own, but not particularly well; something like the watchdog sketched below can help a bit.
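(A minimal sketch of such a watchdog, assuming boinccmd is on the PATH and talking to a local client; the watermark numbers are invented, and the output parsing is deliberately crude since the exact --get_file_transfers format may vary between client versions:)

    import subprocess, time

    HIGH, LOW = 40, 10          # pending-upload watermarks (assumptions)

    def pending_uploads():
        out = subprocess.run(["boinccmd", "--get_file_transfers"],
                             capture_output=True, text=True).stdout
        return out.count(".zip")  # crude: one .zip name per pending transfer

    while True:
        n = pending_uploads()
        if n >= HIGH:
            subprocess.run(["boinccmd", "--set_network_mode", "never"])
        elif n <= LOW:
            subprocess.run(["boinccmd", "--set_network_mode", "auto"])
        time.sleep(300)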


SolarSyonyk wrote:
I've got a dual core EPYC VM running with 10GB RAM at $11.63/mo. It's running about 20h per task, with two going at any given time: https://www.cpdn.org/results.php?hostid=1538282
So either you get lucky and upload server availability recovers soon enough; or you jump through hoops to add storage to the VM while it is up and running; or you suspend the unstarted tasks, wait for the running tasks to complete, and then shut the VM down; or you shut the VM down right away and risk the tasks erroring out after resumption; or you suspend the VM, at extra charge, with the provider storing your VM state.
ID: 67965
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 11,989,490
RAC: 23,624
Message 67966 - Posted: 22 Jan 2023, 10:13:33 UTC
Last modified: 22 Jan 2023, 10:35:09 UTC

So I just noticed a problem that I thought wasn't going to happen. Tasks are timing out (due to upload issues) and are being resent to new users. The 30 day grace period setting doesn't seem to be working, or at least not in the way I'd expect it to. Richard, for example, you have a bunch of tasks like that.
ID: 67966
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,276,661
RAC: 11,053
Message 67968 - Posted: 22 Jan 2023, 10:30:24 UTC - in response to Message 67966.  

So I just noticed a problem that I thought wasn't going to happen. Tasks are timing out (due to upload issues) and are being resent to new users. The 30 day grace period setting doesn't seem to be working, or at least not in the way I'd expect it to. Richard, for example, you have a bunch of tasks like that.
Yes, I'm aware of those - they're the result of an educational experiment I carried out for Glenn, which went wrong for unexpected reasons. Those tasks are lost, and wouldn't have been returned even if there had been a grace period - I was going to suggest they should be resent immediately, so I'm glad to see that's happened.

But interestingly, the resent tasks have been given a two-month deadline. It looks like the project considered the 'grace period' route, but decided to handle the problem in the traditional way instead.
ID: 67968
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 11,989,490
RAC: 23,624
Message 67969 - Posted: 22 Jan 2023, 10:39:13 UTC - in response to Message 67968.  

they're the result of an educational experiment I carried out for Glenn, which went wrong for unexpected reasons

It seems to be more than that, as I got a couple of tasks as resends from 2 different users who have a bunch of timed-out tasks. I did notice the 2-month deadline on newly downloaded tasks.
ID: 67969
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,276,661
RAC: 11,053
Message 67970 - Posted: 22 Jan 2023, 11:03:50 UTC - in response to Message 67969.  

It seems to be more than that, as I got a couple of tasks as resends from 2 different users who have a bunch of timed-out tasks. I did notice the 2-month deadline on newly downloaded tasks.
Ah - had another look. I've downloaded a number of tasks since the new upload failure struck yesterday afternoon, and all of them (resends and initial _0 replications) show:

  • A 30-day deadline in my local BOINC Manager
  • A 60-day deadline on the website.

So it IS a grace period, but the BOINC server must only apply it to newly issued tasks after the configuration change - not to tasks already 'in the field'.

That's good - the shorter local deadline will prompt the local client to go into 'hurry-up' mode sooner, while there's still time to complete the task before a replacement is issued.
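(For reference, the standard BOINC server option that produces this kind of split is report_grace_period in the project's config.xml. The 30-day value below is an inference from the deadlines observed here, not a confirmed CPDN setting:)

    <config>
        <!-- assumed: 30 days, in seconds; the server times a task out,
             and issues a resend, only at deadline + grace period -->
        <report_grace_period>2592000</report_grace_period>
    </config>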

ID: 67970
Saenger
Joined: 1 Nov 04
Posts: 185
Credit: 4,154,204
RAC: 1,528
Message 67971 - Posted: 22 Jan 2023, 11:35:11 UTC

I got this with the last .zip for one WU:
So 22 Jan 2023 12:27:41 CET | climateprediction.net | Started upload of oifs_43r3_ps_0943_2008050100_123_977_12194587_0_r748317941_122.zip
So 22 Jan 2023 12:27:42 CET | climateprediction.net | [http] [ID#8542] Info:  17 bytes stray data read before trying h2 connection
So 22 Jan 2023 12:27:42 CET | climateprediction.net | [http] [ID#8542] Info:  Hostname upload11.cpdn.org was found in DNS cache
So 22 Jan 2023 12:27:42 CET | climateprediction.net | [http] [ID#8542] Info:    Trying 192.171.169.187:80...
So 22 Jan 2023 12:27:42 CET | climateprediction.net | [http] [ID#8542] Info:  TCP_NODELAY set
So 22 Jan 2023 12:27:42 CET | climateprediction.net | [http] [ID#8542] Info:  connect to 192.171.169.187 port 80 failed: Connection refused
So 22 Jan 2023 12:27:42 CET | climateprediction.net | [http] [ID#8542] Info:  Failed to connect to upload11.cpdn.org port 80: Connection refused
So 22 Jan 2023 12:27:42 CET | climateprediction.net | [http] [ID#8542] Info:  Closing connection 4249
So 22 Jan 2023 12:27:42 CET | climateprediction.net | [http] HTTP error: Couldn't connect to server
So 22 Jan 2023 12:27:42 CET |  | Project communication failed: attempting access to reference site
So 22 Jan 2023 12:27:42 CET |  | [http] HTTP_OP::init_get(): https://www.google.com/
So 22 Jan 2023 12:27:42 CET | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0943_2008050100_123_977_12194587_0_r748317941_122.zip: connect() failed
So 22 Jan 2023 12:27:42 CET | climateprediction.net | Backing off 04:01:24 on upload of oifs_43r3_ps_0943_2008050100_123_977_12194587_0_r748317941_122.zip
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Info:  Found bundle for host www.google.com: 0x55c2db0a5570 [can multiplex]
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Info:  Re-using existing connection! (#4229) with host www.google.com
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Info:  Connected to www.google.com (142.250.181.196) port 443 (#4229)
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Info:  Using Stream ID: 5 (easy handle 0x55c2db407f10)
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Sent header to server: GET / HTTP/2
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Sent header to server: Host: www.google.com
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Sent header to server: user-agent: BOINC client (x86_64-pc-linux-gnu 7.20.5)
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Sent header to server: accept: */*
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Sent header to server: accept-encoding: deflate, gzip, br
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Sent header to server: accept-language: de_DE
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Sent header to server:
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: HTTP/2 200
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: date: Sun, 22 Jan 2023 11:27:43 GMT
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: expires: -1
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: cache-control: private, max-age=0
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: content-type: text/html; charset=ISO-8859-1
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: cross-origin-opener-policy-report-only: same-origin-allow-popups; report-to="gws"
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: report-to: {"group":"gws","max_age":2592000,"endpoints":[{"url":"https://csp.withgoogle.com/csp/report-to/gws/other"}]}
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: p3p: CP="This is not a P3P policy! See g.co/p3phelp for more info."
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: content-encoding: gzip
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: server: gws
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: content-length: 6646
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: x-xss-protection: 0
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: x-frame-options: SAMEORIGIN
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: set-cookie: SOCS=CAAaBgiA-bGeBg; expires=Wed, 21-Feb-2024 11:27:43 GMT; path=/; domain=.google.com; Secure; SameSite=lax
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: set-cookie: AEC=ARSKqsJ2tJfYwIj7BRqYZCzCjnL9TNokU0KMa-wLLyh3nOCeOM4qvp504A; expires=Fri, 21-Jul-2023 11:27:43 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=lax
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: set-cookie: __Secure-ENID=9.SE=EIkGuGkKLNZ5WKFdwpMeNVaclpvMIgOG-OaWak1yb9XtPFrIIK4_dt8Vp8tD7ooRuf7gyd5_8ydovUxE4FRoo40fM6BrbZZVjBD9t5nxmgoPH8vz98e04Z0EDsd-l37wrYtYCugw3LLFNaWKIDO4SAcE5mTGgJ4MYA5Tblb_s1A; expires=Thu, 22-Feb-2024 03:46:01 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=lax
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: set-cookie: CONSENT=PENDING+503; expires=Tue, 21-Jan-2025 11:27:43 GMT; path=/; domain=.google.com; Secure
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server: alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Received header from server:
So 22 Jan 2023 12:27:43 CET |  | [http_xfer] [ID#0] HTTP: wrote 2513 bytes
So 22 Jan 2023 12:27:43 CET |  | [http_xfer] [ID#0] HTTP: wrote 3141 bytes
So 22 Jan 2023 12:27:43 CET |  | [http_xfer] [ID#0] HTTP: wrote 3320 bytes
So 22 Jan 2023 12:27:43 CET |  | [http_xfer] [ID#0] HTTP: wrote 3471 bytes
So 22 Jan 2023 12:27:43 CET |  | [http_xfer] [ID#0] HTTP: wrote 1420 bytes
So 22 Jan 2023 12:27:43 CET |  | [http_xfer] [ID#0] HTTP: wrote 218 bytes
So 22 Jan 2023 12:27:43 CET |  | [http_xfer] [ID#0] HTTP: wrote 931 bytes
So 22 Jan 2023 12:27:43 CET |  | [http] [ID#0] Info:  Connection #4229 to host www.google.com left intact
So 22 Jan 2023 12:27:43 CET |  | Internet access OK - project servers may be temporarily down.

At the same time, 3 new WUs were downloaded without any problem.
Greetings from Sänger
ID: 67971
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,276,661
RAC: 11,053
Message 67972 - Posted: 22 Jan 2023, 12:04:50 UTC
Last modified: 22 Jan 2023, 12:59:49 UTC

Mine went through a cycle:

22/01/2023 09:37:47 | climateprediction.net | Finished upload of oifs_43r3_ps_0030_2007050100_123_976_12192674_1_r1655111617_31.zip
22/01/2023 09:37:48 | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0431_2016050100_123_985_12202075_0_r1887730273_18.zip: transient HTTP error
22/01/2023 09:44:33 | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0041_1996050100_123_965_12181685_0_r1930143653_13.zip: connect() failed
There was another batch of uploads from 09:33 to 09:37, at normal speed. Then, a group of HTTP errors after long timeouts - I think we interpret that as an upload server crash. Finally, a series of almost-instantaneous connect failures, which are continuing as I type.

It does seem that the upload server doesn't anticipate the local disks filling up very well, and then takes an extraordinarily long time to pass on the surplus and free enough space for normal service to resume. And then it fills up again very quickly.
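(What 'anticipating' could look like: an illustrative high/low-watermark check with hysteresis. The spool path and thresholds are invented; this is not how CPDN's server is actually written:)

    import shutil

    SPOOL = "/upload/spool"     # hypothetical upload spool directory
    HIGH, LOW = 0.90, 0.70      # fraction of disk used

    def intake_allowed(currently_allowed):
        usage = shutil.disk_usage(SPOOL)
        used = 1 - usage.free / usage.total
        if used >= HIGH:
            return False        # stop accepting early; let rsync drain
        if used <= LOW:
            return True         # plenty of room again
        return currently_allowed  # hysteresis: keep prior state in between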
ID: 67972
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4345
Credit: 16,533,637
RAC: 5,933
Message 67973 - Posted: 22 Jan 2023, 12:52:48 UTC - in response to Message 67972.  

It does seem that the upload server doesn't anticipate the local disks filling up very well, and then takes an extraordinarily long time to pass on the surplus and free enough space for normal service to resume. And then it fills up again very quickly.
Only one has gotten through for me this morning since I first looked, at about 07:00 UTC. I think it is going to be a long wait again.
ID: 67973
SolarSyonyk

Joined: 7 Sep 16
Posts: 257
Credit: 31,932,017
RAC: 38,282
Message 67974 - Posted: 22 Jan 2023, 15:18:29 UTC - in response to Message 67965.  

So either you get lucky and upload server availability recovers soon enough; or you jump through hoops to add storage to the VM while it is up and running; or you suspend the unstarted tasks, wait for the running tasks to complete, and then shut the VM down; or you shut the VM down right away and risk the tasks erroring out after resumption; or you suspend the VM, at extra charge, with the provider storing your VM state.


I just waited for the tasks to finish and shut the VM down. No point in paying to process units that may or may not get where they need to go. I'm doing the same for my onsite boxes: just letting them finish, then waiting until the uploads clear out before resuming. It's becoming more hassle than it's worth to try to work around upload failures, so I may point my CPU cycles at something that can take uploads, or just shut the machines down for a while. It's really hard to dig out from a backlog - I only barely have the bandwidth to keep up with production, and I had machines running with tasks suspended for quite a few days to try to get WUs uploaded before their deadlines.

If the infrastructure isn't stable, I'm done with heroics to try to work around it. I'll simply wait until stuff works before bothering again.
ID: 67974
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1060
Credit: 16,538,338
RAC: 2,071
Message 67976 - Posted: 22 Jan 2023, 16:33:36 UTC

I am confused.

I used to go to climateprediction.net to get here, and yesterday evening that failed. I could not get anywhere at all; I had to change it to cpdn.org to get here today. Could that be why I cannot upload anything?

Checking if climateprediction.net is down or it is just you...
It's not just you! climateprediction.net is down.


Sun 22 Jan 2023 11:29:25 AM EST | climateprediction.net | Started upload of oifs_43r3_ps_0012_2009050100_123_978_12194656_0_r313555412_87.zip
Sun 22 Jan 2023 11:29:28 AM EST |  | Project communication failed: attempting access to reference site
Sun 22 Jan 2023 11:29:28 AM EST | climateprediction.net | Temporarily failed upload of oifs_43r3_ps_0012_2009050100_123_978_12194656_0_r313555412_87.zip: connect() failed
Sun 22 Jan 2023 11:29:28 AM EST | climateprediction.net | Backing off 00:02:48 on upload of oifs_43r3_ps_0012_2009050100_123_978_12194656_0_r313555412_87.zip
Sun 22 Jan 2023 11:29:30 AM EST |  | Internet access OK - project servers may be temporarily down.

ID: 67976
xii5ku

Joined: 27 Mar 21
Posts: 79
Credit: 78,302,757
RAC: 1,077
Message 67977 - Posted: 22 Jan 2023, 17:38:54 UTC
Last modified: 22 Jan 2023, 17:48:51 UTC

Saenger wrote:
I got this with the last .zip for one WU: [...]
While at the same time 3 new WUs got downloaded without any problem.

Jean-David Beyer wrote:
I am confused.

I used to go to climateprediction.net to get here and yesterday evening that failed. I could not even get anywhere. I had to change it to cpdn.org to get here today. Could that be why I cannot upload anything?

Checking if climateprediction.net is down or it is just you...
It's not just you! climateprediction.net is down.
There are (at least) four physically different servers:

    www.climateprediction.net — just a web site, basically unrelated to the CPDN BOINC operations. It's currently down for unknown reasons. (Actually, it is related to the CPDN BOINC functions in that the BOINC project URL is named www.climateprediction.net too. I suppose it is impossible to attach new clients to CPDN for as long as this web server is down.)

    www.cpdn.org — the main CPDN BOINC site. Hosts the scheduler, BOINC's own web pages and message board, the download server, the validator, the assimilator… This one is up and running well.

    upload11.cpdn.org — currently hosts the upload file handler for Linux OpenIFS work. It's up, but configured to accept only very few simultaneous HTTP connections; so few that most of our connection attempts are rejected. The reason is that this server ran out of disk space yesterday and first needs to offload a lot of data to an external storage server, which it can only do if there isn't too much data incoming from client computers at the same time. Eventually this situation will be over and the admin will increase the allowed connection count again, somewhat. Expect this sort of unavailability to happen again and again until the current OpenIFS work is done (unless CPDN can afford a storage subsystem with magnitudes more temporary space, or a magnitudes faster outbound data link from the upload file handler to the backing store).

    upload???.cpdn.org — currently hosts the upload file handler for Windows Weather@Home work. I take it from user reports here that this server is down right now too. (I'm just guessing, because I don't have any W@H uploads myself.)

________

I hope this gives a picture of why some things work and others don't.
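(If in doubt, a plain TCP probe from your end distinguishes these cases; a sketch using the host/port pairs above - it checks reachability only, not whether an upload would actually succeed:)

    import socket

    HOSTS = [("www.climateprediction.net", 443),
             ("www.cpdn.org", 443),
             ("upload11.cpdn.org", 80)]

    for host, port in HOSTS:
        try:
            socket.create_connection((host, port), timeout=5).close()
            print(host, port, "reachable")
        except OSError as exc:
            print(host, port, "FAILED:", exc)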

ID: 67977
Yeti
Joined: 5 Aug 04
Posts: 171
Credit: 10,266,501
RAC: 29,602
Message 67978 - Posted: 22 Jan 2023, 20:19:49 UTC

Hm, guys, I think at the moment CPDN has more crunching power than the infrastructure can handle. So I think it is better for the project if I pause CPDN crunching for quite a while, until the infrastructure can handle the load.

For now, I'll let all my clients finish their already-downloaded tasks, but not download any new ones.


Supporting BOINC, a great concept!
ID: 67978
AndreyOR

Joined: 12 Apr 21
Posts: 247
Credit: 11,989,490
RAC: 23,624
Message 67979 - Posted: 22 Jan 2023, 23:37:29 UTC - in response to Message 67978.  

Hm, guys, I think at the moment CPDN has more crunching power than the infrastructure can handle.

Funny you say that, as there's been talk of trying to increase the user base by making a VirtualBox app. It sure seems like the project is finding out the hard way that introducing a new model type (OpenIFS) is not that easy and isn't a small undertaking. The biggest problem currently is the upload situation. There's also the credit issue, which started before the upload issue, and more recently the RAC issue. The main website is down too.

I hope we can at least finish out this contract and get everything processed and uploaded by the end of February.
ID: 67979