Posts by mmonnin

1) Message boards : Number crunching : Upload failures (Message 60966)
Posted 23 days ago by mmonnin
Post:
I had an anz50 task stuck uploading the last 4 zips, a restart and out file. I think it happened around when the server went down. I aborted the uploads as the task had 20 trickles and then it changed to completed.
https://www.cpdn.org/result.php?resultid=21741910
2) Message boards : Number crunching : Irritated - some questions (Message 60745)
Posted 30 Jul 2019 by mmonnin
Post:
If you look at the due dates, they are one year out. Speed is not their thing.

At other times it is best to shut them down and save electricity.

Thanks


I, and others, run other projects when there is no CPDN work available. CPDN has a lack of work, not a lack of CPU power.
3) Message boards : Number crunching : Credits (Message 60722)
Posted 26 Jul 2019 by mmonnin
Post:
Credit at CPDN has been updated. Step 1 complete.

Now the external stats files need to be updated. They still haven't updated since 6/24:
https://www.cpdn.org/cpdnboinc/stats/

Data from 2015:
https://www.climateprediction.net/stats/
4) Message boards : Number crunching : Credit scoring question. (Message 60721)
Posted 26 Jul 2019 by mmonnin
Post:
Yup, the other tasks have the same 19,196.51 total credit as the other user.

Be glad this isn't CreditNew where your own credit will vary depending on your wingman's processor.
5) Message boards : Number crunching : BOINC Client Improvements (Message 60711)
Posted 24 Jul 2019 by mmonnin
Post:
I know it is possible to set specific hours to either do work or access the network. How easy would it be to make it possible to set specific hours to use all processors and a reduced number at other times, e.g. use 50% of processors between 08:30 and 20:30 and 100% at other times?


My first thought was a script/windows scheduler to swap app_config files to use max # of concurrent tasks. Restart the client after the swap unless there is a boinccmd to re-read app_configs.
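
A minimal sketch of that idea, assuming the limit is applied through `project_max_concurrent` in the project's `app_config.xml`. The 08:30 to 20:30 window comes from the question above; the function names and the project directory argument are hypothetical and only for illustration.

```python
from datetime import time
from pathlib import Path

# Window taken from the question above; adjust as needed.
REDUCED_START = time(8, 30)
REDUCED_END = time(20, 30)


def max_concurrent(now: time, total_cpus: int) -> int:
    """Return half the CPUs inside the reduced window, all of them outside it."""
    if REDUCED_START <= now < REDUCED_END:
        return max(1, total_cpus // 2)
    return total_cpus


def render_app_config(limit: int) -> str:
    """Build an app_config.xml body that caps concurrent tasks for one project."""
    return (
        "<app_config>\n"
        f"    <project_max_concurrent>{limit}</project_max_concurrent>\n"
        "</app_config>\n"
    )


def write_app_config(project_dir: Path, limit: int) -> Path:
    """Write app_config.xml into the (hypothetical) project directory."""
    target = project_dir / "app_config.xml"
    target.write_text(render_app_config(limit))
    return target
```

A scheduled job (Windows Task Scheduler or cron) could call `write_app_config` at 08:30 and 20:30. The client then has to re-read its config files: "Read config files" in the Manager should do it, and `boinccmd --read_cc_config` may as well, but treat that as an assumption and fall back to a client restart if the new limit is not picked up. Note this caps tasks for one project only, not total CPU use by the client.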
6) Message boards : Number crunching : Boinc config question (Message 60710)
Posted 24 Jul 2019 by mmonnin
Post:
I have no idea why it was set to take so long for resource share to get to the desired ratio between projects. I don't see the benefit of having a setting that takes a month to get to the desired effect.

Very long tasks like CPDN can mess with that as well. Users are better off setting a project_max_concurrent. However, that can put a host in a spot where there are CPDN tasks available but limited by that option, and no other tasks will download from other projects because the queue is full. Cores can end up idle.

If I end up downloading more CPDN tasks than I want to run at once, I will suspend any buffer and extra tasks I do not want to run. That stops more from downloading and clears up the queue for another project. Set CPDN to something like 1000 resource share so its tasks never get interrupted, and another project at 1% so it won't take over anything but the leftover cores. It's above 0%, so a queue will download.

When some CPDN tasks complete I release some more to run. Repeat.
7) Message boards : Number crunching : Credits (Message 60689)
Posted 21 Jul 2019 by mmonnin
Post:
Over/under on the # of weeks until this gets corrected?
8) Message boards : Number crunching : Long Project and other working projects disappeared, now few tasks performing. Questions. (Message 60679)
Posted 19 Jul 2019 by mmonnin
Post:
These?
https://www.cpdn.org/cpdnboinc/results.php?hostid=1328590
9) Message boards : Number crunching : Credits (Message 60672)
Posted 15 Jul 2019 by mmonnin
Post:
This time, the stats at your profile have not been affected. Only external stats sites. Read through the threads.

Sorry, not true. The number of credits on climateprediction.net is higher than those external sites show, but still short over 2 million credits from what they should be.


That is exactly what I said. Your profile, here at CPDN, is correct and is higher. Stats sites have not updated (BOINCStats) or have reverted to old stats (Free-DC). The stats here also did not update last week. Like the post right above yours said, an admin is being emailed about restarting the stats script.
10) Message boards : Number crunching : Credits (Message 60662)
Posted 14 Jul 2019 by mmonnin
Post:
On June 24 I lost about 2.3 million credits. This happened once before and that time I saw messages that indicated that there was a general failure and other users besides myself were affected. Eventually the problem was fixed and the credits lost on that previous occasion were restored. This time I found no such messages and wonder if this time the loss was just for me. Is there anyone I can contact who can look into this?

Thanks.


That older situation was completely different. Previously the server crashed and a db was restored with a lower credit for everyone. The credit was later restored as mentioned.

This time, the stats at your profile have not been affected. Only external stats sites. Read through the threads.
11) Message boards : Number crunching : BOINC Client Improvements (Message 60638)
Posted 11 Jul 2019 by mmonnin
Post:
By BOINC client, you actually mean the part of BOINC behind the scenes and not the GUI? Or the whole BOINC install? Or BOINC Mgr? Some use the terms interchangeably.

If it includes the mgr, take a look at the program BOINCTasks. It already has a lot of improvements to the GUI that should be in the Mgr.

If it's the actual client, then many people have asked about managing hardware via slots. I guess kind of like how FAH handles hardware. GPU 1 can be set up with X settings in slot 1, while GPU 2 can be set up with Y settings in slot 2. 3 CPU threads can run this task in slot 4, while 5 more threads in slot 4. For BOINC that could be set as 3 threads on one project, 5 on another project. For finer control like this, separate/concurrent clients are needed to set up the GPUs on separate projects with separate queues. Or CPU queues different than GPU queues.

I never want tasks to be interrupted. Never ever stop a task to run something else. Ever. Discard a task if it will miss a deadline before an already running task is stopped for another.

Separate the Run priority/Resource Share into 2 parts: which project has its tasks downloaded via resource share, and which project has its tasks run next. I'd much rather have the tasks run FIFO.

Setting "No New work" should not affect the tasks that have already been downloaded/in queue from running. The client thinks it's OK to leave those tasks until the minute before their deadline. Again, FIFO, and let resource share control task download, not task running order.

Make resource share instant, not a gradual change over time. Or an option for it. So many people complain at project sites because they don't know how this works and assume a new project is controlling task download when it's really the client.
12) Message boards : Cafe CPDN : BOINC CONFERENCE (Message 60629)
Posted 10 Jul 2019 by mmonnin
Post:
There are several possible origins of the nickname. The one mentioned on the Hancock tower tour is when Chicago lobbied to host the World Fair.
13) Message boards : Number crunching : Upload failures (Message 60628)
Posted 10 Jul 2019 by mmonnin
Post:
Thanks for the history on the cam25's. I found the one in question, and it has returned 18 zips.
https://www.cpdn.org/cpdnboinc/result.php?resultid=21709022

However, the ones that are stuck are #12 and #13. So it looks like they got lost in the shuffle.
If they have not uploaded by the time my other work has finished tomorrow, I will just can them
(as in trash can; I just realized that may not be clear to non-native English speakers).

What I've done with previous CAM25s that have persistently stuck uploads is just abort the transfers that are obviously stuck. After aborting the transfers, it will report the task, possibly as a success, and the scientists can determine whether the output is useful without the missing zips. I've only done this with the CAM25 tasks however since those are the ones that seem to occasionally have the rogue stuck uploads.


This worked for me too. Trickle _15 was stuck at 51%. After reading this I checked and 18 trickles were uploaded according to the task stats. Aborted and it updated to successful.
14) Message boards : Number crunching : Upload failures (Message 60585)
Posted 4 Jul 2019 by mmonnin
Post:
CPDN has a track record of computation errors when suspending tasks, so I sure ignored that part. I don't blame anyone else for it either.

I left the 2 that had not started yet suspended, but I let the ones that had started complete even if the uploads will take a while.


Never seen a problem from suspending tasks if BOINC isn't stopped and restarted. It has also been a long time since I lost a Windows task even when doing that.

Just to reiterate so the information stays near the top of the thread: clearing the data and the backlog of people still uploading data, some of whom have several hundred gigabytes, means that it could easily be a week or more before the problems stop completely. Also, there is no need to suspend any tasks other than sam50's as they go to different servers.


How soon you forget. You started this thread.
https://www.cpdn.org/cpdnboinc/forum_thread.php?id=8701#59554

CPDN tasks are some of the most fragile tasks of all the BOINC projects. Most other projects' tasks have no issues suspending, or at least going back to the last checkpoint. Even if CPDN tasks did go back to the last checkpoint, no one wants to lose several days of work. There's a higher chance of losing work from suspending than from a task trickle upload being lost.
15) Message boards : Number crunching : Upload failures (Message 60576)
Posted 3 Jul 2019 by mmonnin
Post:
CPDN has a track record of computation errors when suspending tasks, so I sure ignored that part. I don't blame anyone else for it either.

I left the 2 that had not started yet suspended, but I let the ones that had started complete even if the uploads will take a while.
16) Message boards : Number crunching : Upload failures (Message 60511)
Posted 30 Jun 2019 by mmonnin
Post:
The client version has nothing to do with a full disk on a project server.
17) Questions and Answers : Getting started : "Communications deferred" ... continuously (Message 60451)
Posted 26 Jun 2019 by mmonnin
Post:
See this thread. You're not alone.
https://www.cpdn.org/cpdnboinc/forum_thread.php?id=8744
18) Message boards : Number crunching : Upload failures (Message 60435)
Posted 24 Jun 2019 by mmonnin
Post:
All of my Linux zips on three machines have gone, so that is progress. But I have over 50 WAH2 zips on my windows machine still stuck. It seems to be discrimination against North America.


Same here. SAM50 and SAFR50 are pending. Those files are several times as big. ~17MB compared to 76/92MB each.
19) Message boards : Number crunching : Free-DC reports negative credits today for CPDN (Message 60434)
Posted 24 Jun 2019 by mmonnin
Post:
A week or two ago someone also reported that Free-DC cannot get new stats from CPDN and the issue is on the CPDN site. So issues ;)


The difference here is that credit reversed on Free-DC while other weeks there was just no update at Free-DC. This time the CPDN site was down. I'm not sure how often Free-DC queries CPDN. It might be a week until it goes back to the data from a couple of weeks ago.

Free-DC also has 2 sets of data to update and display the site. At times they can get out of sync.
20) Message boards : Number crunching : Free-DC reports negative credits today for CPDN (Message 60420)
Posted 24 Jun 2019 by mmonnin
Post:
It just loaded an old copy of data since the site was down when it tried to pull stats. It's happened plenty of times at Free-DC. Mine went back to a familiar # for me.

