Welcome back/checking if everything is working?

Message boards : Number crunching : Welcome back/checking if everything is working?

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,501,246
RAC: 5,648
Message 62876 - Posted: 6 Nov 2020, 20:32:01 UTC - in response to Message 62875.  

How projects play together is something I know little about because I only run CPDN tasks except when none are available so my knowledge of it is nearly all from reading posts here and on the BOINC fora.


Do you manage to have CPDN running non stop? Is there enough Linux work to keep it busy? Or do you just let the computer doze off in between? I like my 66 CPU cores and 4 GPUs to be doing something all the time. My wallet does not.


Currently I switch to the Africa Rainfall Project with World Community Grid during breaks in work. I also occasionally get testing work when no main site work is available, but it is not unusual to go for a week or even a month or more without work. I can also run Windows tasks under WINE if only Windows work is available.
KAMasud

Joined: 6 Oct 06
Posts: 204
Credit: 7,608,986
RAC: 0
Message 62877 - Posted: 7 Nov 2020, 3:06:29 UTC - in response to Message 62875.  

How projects play together is something I know little about because I only run CPDN tasks except when none are available so my knowledge of it is nearly all from reading posts here and on the BOINC fora.


Do you manage to have CPDN running non stop? Is there enough Linux work to keep it busy? Or do you just let the computer doze off in between? I like my 66 CPU cores and 4 GPUs to be doing something all the time. My wallet does not.

----------------------------------------------

I have twelve cores and they are dedicated to CPDN only (CPU). I do not do any other CPU project. I also live in an area of the world where the ambient temperatures are high, so I cannot afford to run all cores. Having more cores, at least for me, is no fun if I end up burning equipment. Having said that, even though dedicated to CPDN I only do one task at a time. The upside to it is faster run times, error-free results, less power cost plus less cooling cost.
I have also observed over time if I run all my cores, errors start to creep in. Plus I have to feel that there is also an element of interference across cores especially if other projects are also being run in tandem. Maybe it might be due to the GPU but no, if I run Rosetta then why does the error rate increase?
For me at least, one task at a time is the way forward.
------------------------------------
As to the Server State Page and the quantity of work being shown, please someone take a broom to it. Two burnt-up laptops which might have been recycled years ago, their WUs are still being shown as active. Don't ask me how.
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 62879 - Posted: 7 Nov 2020, 12:20:44 UTC - in response to Message 62877.  

I have twelve cores and they are dedicated to CPDN only (CPU). I do not do any other CPU project. I also live in an area of the world where the ambient temperatures are high, so I cannot afford to run all cores.
Is it not possible to fit a larger cooling fan to the CPU, or even a watercooler? Running only one core is not doing much processing.

Having more cores at least for me is no fun if I end up burning equipment. Having said that, even though dedicated to CPDN I only do one task at a time. The upside to it is faster run times, error-free results, less power cost plus less cooling cost.
I have also observed over time if I run all my cores, errors start to creep in. Plus I have to feel that there is also an element of interference across cores especially if other projects are also being run in tandem. Maybe it might be due to the GPU but no, if I run Rosetta then why does the error rate increase?
For me at least, one task at a time is the way forward.
I think there may be a problem if CPDN tasks are suspended and resumed (due to rebooting, an exclusive application like a game, or BOINC switching to Rosetta).

I'm noticing the CPDN tasks I'm running have stopped giving me credit, I got credit right near the start, now nothing, maybe they are all damaged? I know three of them said computation error after all I did was restart a machine. Can anyone check please?

As to the Server State Page and the quantity of work being shown, please someone take a broom to it. Two burnt-up laptops which might have been recycled years ago, their WUs are still being shown as active. Don't ask me how.
Agreed - that page serves no purpose whatsoever, it's just showing meaningless numbers and helps nobody.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 62884 - Posted: 7 Nov 2020, 12:59:56 UTC - in response to Message 62879.  

Up until now, credits have been based on the receipt of trickle_up files.
No trickles, no credits.

A new system is on its way, but it will be a while yet.
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,501,246
RAC: 5,648
Message 62885 - Posted: 7 Nov 2020, 13:00:42 UTC

I'm noticing the CPDN tasks I'm running have stopped giving me credit, I got credit right near the start, now nothing, maybe they are all damaged? I know three of them said computation error after all I did was restart a machine. Can anyone check please?


Currently the credit script only runs on Thursdays so you should get a weekly update?
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,501,246
RAC: 5,648
Message 62887 - Posted: 7 Nov 2020, 13:05:50 UTC

I have twelve cores and they are dedicated to CPDN only (CPU). I do not do any other CPU project. I also live in an area of the world where the ambient temperatures are high, so I cannot afford to run all cores. Having more cores, at least for me, is no fun if I end up burning equipment. Having said that, even though dedicated to CPDN I only do one task at a time. The upside to it is faster run times, error-free results, less power cost plus less cooling cost.
I have also observed over time if I run all my cores, errors start to creep in. Plus I have to feel that there is also an element of interference across cores especially if other projects are also being run in tandem. Maybe it might be due to the GPU but no, if I run Rosetta then why does the error rate increase?
For me at least, one task at a time is the way forward.


I haven't seen problems with concurrent projects. However, running tasks on all 16 threads of my 8-core Ryzen results in lower throughput than running only 8 N216 tasks at a time. This is because they use a lot of the level 3 cache, about 3 MB per task or a little more. I don't know whether running another project that makes heavy use of level 3 cache is the problem.
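
For anyone wanting to put rough numbers on the cache point, here is a minimal Python sketch. The only figure taken from this post is the roughly 3 MB of level 3 cache per N216 task; the function name, the 32 MB cache size, and the rule of capping at the physical core count are assumptions for illustration, not anything CPDN or BOINC provides.

```python
def suggested_task_count(l3_cache_mb: float,
                         physical_cores: int,
                         mb_per_task: float = 3.0) -> int:
    """Estimate how many tasks fit in level 3 cache (hypothetical helper).

    mb_per_task defaults to the ~3 MB per N216 task quoted in the post.
    The result is capped at the number of physical cores.
    """
    tasks_that_fit = int(l3_cache_mb // mb_per_task)
    return max(1, min(physical_cores, tasks_that_fit))


# Assumed example values: an 8-core CPU with 32 MB of level 3 cache.
print(suggested_task_count(l3_cache_mb=32, physical_cores=8))
```

This is only a back-of-envelope estimate; actual throughput is best found by timing a few tasks at different concurrency settings.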
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 62888 - Posted: 7 Nov 2020, 13:16:51 UTC - in response to Message 62885.  
Last modified: 7 Nov 2020, 13:20:48 UTC

I'm noticing the CPDN tasks I'm running have stopped giving me credit, I got credit right near the start, now nothing, maybe they are all damaged? I know three of them said computation error after all I did was restart a machine. Can anyone check please?
Currently the credit script only runs on Thursdays so you should get a weekly update?
Thanks, that explains it. I think I got my credits this Thursday and assumed I was going to get some for every 10% done or something. I'll work on the assumption that the ones which didn't say "computation error" are doing something useful. I got 36 tasks, and 3 caused errors. I'm putting those down to problems I had with a new GPU that was crashing the OS. It seems CPDN tasks can't cope with that - maybe it's deliberate to take the task away from an unstable machine - fair enough. They're managing ok if they're cleanly paused (eg Boinc swapping projects or a game being played and pausing them with exclusive applications).
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,501,246
RAC: 5,648
Message 62891 - Posted: 7 Nov 2020, 16:47:25 UTC - in response to Message 62888.  

I'm noticing the CPDN tasks I'm running have stopped giving me credit, I got credit right near the start, now nothing, maybe they are all damaged? I know three of them said computation error after all I did was restart a machine. Can anyone check please?
Currently the credit script only runs on Thursdays so you should get a weekly update?
Thanks, that explains it. I think I got my credits this Thursday and assumed I was going to get some for every 10% done or something. I'll work on the assumption that the ones which didn't say "computation error" are doing something useful. I got 36 tasks, and 3 caused errors. I'm putting those down to problems I had with a new GPU that was crashing the OS. It seems CPDN tasks can't cope with that - maybe it's deliberate to take the task away from an unstable machine - fair enough. They're managing ok if they're cleanly paused (eg Boinc swapping projects or a game being played and pausing them with exclusive applications).


The trickle ups for which credit is given match the zip files uploaded at the end of each model month. So on this task of yours,
wah2_sam50_a09k_201312_25_885_012039607_0, it is a 25-month task, so there is a trickle every 4%. The other parts of the task name of interest are sam50, which tells you it is for the South America region at a resolution of 50 km squares; 201312, which gives the year and start month of the task; and 885, which is the batch number.
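
The breakdown above can be sketched in Python roughly as follows. This is not an official CPDN tool: the field order is assumed from this one example name, the class and function names are made up for illustration, and the parts not explained in this thread (wah2, a09k, and the trailing numbers) are simply ignored or kept as-is.

```python
from dataclasses import dataclass


@dataclass
class TaskName:
    region: str          # e.g. "sam" (South America, per the post)
    resolution_km: int   # e.g. 50
    start_year: int      # e.g. 2013
    start_month: int     # e.g. 12
    model_months: int    # e.g. 25
    batch: int           # e.g. 885


def parse_task_name(name: str) -> TaskName:
    """Split a task name on underscores (assumed layout, see lead-in)."""
    parts = name.split("_")
    # Assumed positions: [1] region+resolution, [3] YYYYMM,
    # [4] length in model months, [5] batch number.
    region_res, start, months, batch = parts[1], parts[3], parts[4], parts[5]
    region = region_res.rstrip("0123456789")
    resolution_km = int(region_res[len(region):])
    return TaskName(region, resolution_km, int(start[:4]), int(start[4:6]),
                    int(months), int(batch))


def percent_per_trickle(task: TaskName) -> float:
    """One trickle per model month, so 100% divided by the month count."""
    return 100 / task.model_months


def trickles_due(task: TaskName, percent_done: float) -> int:
    """Number of whole model months completed at a given progress."""
    return int(percent_done * task.model_months // 100)


task = parse_task_name("wah2_sam50_a09k_201312_25_885_012039607_0")
print(task)
print(percent_per_trickle(task))   # 4.0 for a 25-month task
print(trickles_due(task, 8))       # 2 model months finished at 8%
```

Remember that credit only shows up after the weekly credit script has run, so the trickle count is a guide to what to expect rather than what is displayed at any moment.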
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 62893 - Posted: 7 Nov 2020, 17:09:13 UTC - in response to Message 62891.  

The trickle ups for which credit is given match the zip files uploaded at the end of each model month. So on this task of yours,
wah2_sam50_a09k_201312_25_885_012039607_0, it is a 25-month task, so there is a trickle every 4%. The other parts of the task name of interest are sam50, which tells you it is for the South America region at a resolution of 50 km squares; 201312, which gives the year and start month of the task; and 885, which is the batch number.
Thanks, now I can tell exactly what it's working on.

That task is running on one of my slower (per core) machines and has only just reached 8%. The faster ones are now at up to 44%, and are also 25 month tasks, so should have done 10 trickles by now, so I assume I have to wait till Thursday to see credits.

I'm not one of those credit addicts, I just like to see them to know it's working properly! A complete loss of new credits on one machine with LHC made me look for a problem with VirtualBox.

PrimeGrid (a newly added project for me) gave me way too many tasks when I attached and got in the way, so I've kicked BOINC and made it do the CPDN tasks. Just how soon should I be aiming to get them in?
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1056
Credit: 16,521,771
RAC: 1,278
Message 63054 - Posted: 30 Nov 2020, 12:00:17 UTC - in response to Message 62739.  

I think uploads from batch 860 go to a server in Tasmania. I have let Andy know.


Is this the same problem? I started getting it yesterday and it is all I get from ClimatePrediction since then. The web site works OK.

Mon 30 Nov 2020 12:07:29 AM EST | climateprediction.net | Not requesting tasks: don't need (not highest priority project)
Mon 30 Nov 2020 12:07:31 AM EST | climateprediction.net | Scheduler request completed
Mon 30 Nov 2020 12:07:31 AM EST | climateprediction.net | Project requested delay of 3636 seconds
Mon 30 Nov 2020 12:08:39 AM EST | | Project communication failed: attempting access to reference site
Mon 30 Nov 2020 12:08:41 AM EST | | Internet access OK - project servers may be temporarily down.
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4342
Credit: 16,501,246
RAC: 5,648
Message 63055 - Posted: 30 Nov 2020, 13:15:50 UTC - in response to Message 63054.  
Last modified: 30 Nov 2020, 17:47:22 UTC

I think uploads from batch 860 go to a server in Tasmania. I have let Andy know.


Is this the same problem? I started getting it yesterday and it is all I get from ClimatePrediction since then. The web site works OK.

Mon 30 Nov 2020 12:07:29 AM EST | climateprediction.net | Not requesting tasks: don't need (not highest priority project)
Mon 30 Nov 2020 12:07:31 AM EST | climateprediction.net | Scheduler request completed
Mon 30 Nov 2020 12:07:31 AM EST | climateprediction.net | Project requested delay of 3636 seconds
Mon 30 Nov 2020 12:08:39 AM EST | | Project communication failed: attempting access to reference site
Mon 30 Nov 2020 12:08:41 AM EST | | Internet access OK - project servers may be temporarily down.


Might have been a temporary glitch as I just got scheduler request completed and the delay message without the Project communication failed one but it is a different issue. The servers around the world to send the zips to can go down without anything being wrong at Oxford. I will check again in a few hours time as I have a couple of tasks that should finish then.

Edit: Two tasks finished and have uploaded with no problems so whatever it was seems to have cleared.
Please do not private message myself or other moderators for help. This limits the number of people who are able to help and deprives others who may benefit from the answer.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 63056 - Posted: 30 Nov 2020, 18:51:05 UTC - in response to Message 63054.  

Jean

That's 2 different things - the ANZ uploads are "uploads", and what your messages are saying, is that your computer is asking for work, which is "downloads".
And it's also saying that BOINC doesn't want more work, because it has enough already.
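
To make the distinction easier to see in a long event log, here is a small Python sketch that splits lines of the kind quoted above into timestamp, project and message, then sorts them into rough categories. The category labels and function names are my own invention for illustration, not BOINC terminology, and the matching is based only on the message wording shown in this thread.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    timestamp: str
    project: str   # empty string when the line has no project field
    message: str


def parse_line(line: str) -> LogEntry:
    """Split 'timestamp | project | message' on the first two pipes."""
    timestamp, project, message = (f.strip() for f in line.split("|", 2))
    return LogEntry(timestamp, project, message)


def classify(entry: LogEntry) -> str:
    """Assign a rough, made-up category based on the message text."""
    msg = entry.message.lower()
    if msg.startswith("not requesting tasks"):
        return "work fetch skipped"       # client has enough work already
    if "project communication failed" in msg:
        return "communication failure"    # could not reach the project
    if "scheduler request" in msg:
        return "scheduler contact"
    return "other"


lines = [
    "Mon 30 Nov 2020 12:07:29 AM EST | climateprediction.net | "
    "Not requesting tasks: don't need (not highest priority project)",
    "Mon 30 Nov 2020 12:07:31 AM EST | climateprediction.net | "
    "Scheduler request completed",
    "Mon 30 Nov 2020 12:08:39 AM EST | | "
    "Project communication failed: attempting access to reference site",
]
for line in lines:
    entry = parse_line(line)
    print(classify(entry), "|", entry.message)
```

None of these categories concerns uploads; problems with the upload servers would show up in different messages from the ones quoted here.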
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1056
Credit: 16,521,771
RAC: 1,278
Message 63061 - Posted: 30 Nov 2020, 22:45:08 UTC - in response to Message 63055.  

Might have been a temporary glitch as I just got scheduler request completed and the delay message without the Project communication failed one but it is a different issue. The servers around the world to send the zips to can go down without anything being wrong at Oxford. I will check again in a few hours time as I have a couple of tasks that should finish then.

Edit: Two tasks finished and have uploaded with no problems so whatever it was seems to have cleared.


I guess it was a long temporary glitch: Lasted all day yesterday, but it is working OK starting, perhaps, this afternoon.
marmot

Joined: 12 May 05
Posts: 34
Credit: 1,357,324
RAC: 875
Message 63127 - Posted: 18 Dec 2020, 21:10:44 UTC

Dedicating this Dell Latitude laptop to CPDN WUs.
Installed Linux Mint 20 and have run into an issue: climateprediction.net is inaccessible, so I continuously get 'project communication failed' (although I was able to connect to my account using the cpdn.org link).
It's not just my issue:
https://isdown.me/www.climateprediction.net reports the website is down.

This is the most recent thread that a search showed a user mentioning this issue so I posted here instead of a new post.
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,487,091
RAC: 4,506
Message 63128 - Posted: 18 Dec 2020, 22:23:03 UTC - in response to Message 63127.  

Dedicating this Dell Latitude laptop to CPDN WUs.
Installed Linux Mint 20 and have run into an issue: climateprediction.net is inaccessible, so I continuously get 'project communication failed' (although I was able to connect to my account using the cpdn.org link).
It's not just my issue:
https://isdown.me/www.climateprediction.net reports the website is down.

This is the most recent thread that a search showed a user mentioning this issue so I posted here instead of a new post.

I e-mailed the cpdn computer people about the problem. Not sure when it will be fixed as it is the weekend already in Oxford.
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,487,091
RAC: 4,506
Message 63130 - Posted: 19 Dec 2020, 17:23:52 UTC - in response to Message 63128.  

Dedicating this Dell Latitude laptop to CPDN WUs.
Installed Linux Mint 20 and have run into an issue: climateprediction.net is inaccessible, so I continuously get 'project communication failed' (although I was able to connect to my account using the cpdn.org link).
It's not just my issue:
https://isdown.me/www.climateprediction.net reports the website is down.

This is the most recent thread that a search showed a user mentioning this issue so I posted here instead of a new post.

I e-mailed the cpdn computer people about the problem. Not sure when it will be fixed as it is the weekend already in Oxford.


https://www.climateprediction.net should be back up and will hopefully resolve any problems connecting a computer to it.
marmot

Joined: 12 May 05
Posts: 34
Credit: 1,357,324
RAC: 875
Message 63145 - Posted: 20 Dec 2020, 20:57:32 UTC - in response to Message 63130.  
Last modified: 20 Dec 2020, 20:58:26 UTC



https://www.climateprediction.net should be back up and will hopefully resolve any problems connecting a computer to it.


That laptop is communicating with that domain now. Thanks.

I could have edited the project .xml and changed it to cpdn.net to get an immediate fix but I did a normal install (instead of dropping all of BOINC in the user HOME directory) and so the files with the domain were locked. The root account could have taken ownership temporarily ... but I got lazy.

Gonna wait for a few days as the new Mint 19.3 install dropped into fallback mode running gaia@home. Maybe one of the widgets for CPU, temps or process on the task bar crashed the OS (it didn't seem like an overheating issue at 65 °C). These CPDN WUs demand days of stability from what I've read.

Mint 20 is out (20.1 in beta) but for some reason Wine absolutely failed to function.

©2024 climateprediction.net