Thread 'New work Discussion'

Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 64712 - Posted: 27 Oct 2021, 6:54:30 UTC
Last modified: 27 Oct 2021, 6:55:09 UTC

OK. My last batch 901 has finished:

First was the restart file, which took 3 seconds to upload
Then a 20 second gap
Then the trickle_up file was sent
Another 30 second gap
The zip 4 was uploaded at about 32 minutes before completion.
This took about 3.5 minutes to upload.

I missed all of these, so I don't know about sizes.

About 20 minutes later the task finished, and the out file was uploaded - 238.04
It happened so fast that I didn't see whether that was bytes or kilobytes. :(

So, if there's "no congestion at the traffic lights", there's plenty of time for each file to get through before the next part shows up. :)
ID: 64712
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 64713 - Posted: 27 Oct 2021, 7:43:42 UTC
Last modified: 27 Oct 2021, 8:37:45 UTC

Now instead of exceeding a file size you're talking about how many files are being uploaded at the same time. I'm now running 3,201 WUs of various projects so that will be next to impossible.
One of these options in one's cc_config.xml file may be useful:

<max_file_xfers>32</max_file_xfers>
<max_file_xfers_per_project>32</max_file_xfers_per_project>


The thing that needs changing is
<max_nbytes>150000000.000000</max_nbytes>


in client_state.xml. Unless you have a slow or unreliable connection, or (as I now realise caused the problem in my case) you turn off internet access for BOINC to let your other half have a Zoom meeting, this shouldn't be a problem. I don't recommend playing with this file if you don't need to, and if you do, please make a backup. I don't turn BOINC off before editing it but I do stop computation on all running tasks. The issue has cropped up a time or two in testing, but any relevant testing obviously didn't include any of us turning off internet access at a critical time.

Edit: Another option, of course, is to just pause any task that is close to finishing when internet access might be compromised.

I think from our discussions Sarah is going to increase the limit for the third batch of tasks from this experiment, so on that one it shouldn't be an issue.
ID: 64713
Richard Haselgrove
Joined: 1 Jan 07
Posts: 1061
Credit: 36,748,059
RAC: 5,647
Message 64714 - Posted: 27 Oct 2021, 9:49:22 UTC - in response to Message 64713.  

The thing that needs changing is
<max_nbytes>150000000.000000</max_nbytes>
in client_state.xml ... I don't recommend playing with this file if you don't need to and if you do, please do make a back up. I don't turn BOINC off before editing it but I do stop computation on all running tasks. The issue has cropped up a time or two in testing but any relevant testing for these obviously didn't include any of us turning off internet access at a critical time.
That makes sense, but I disagree with the bit I've marked in red.

The BOINC client writes out a whole new copy of client_state.xml every minute or so (and keeps its own backup as client_state_prev.xml), but it only ever reads it as the client is starting.

While the client is running, the whole darn thing is kept in memory, and if you try to edit the file copy, your changes will be overwritten and lost at the next write. You need to stop the client (the client only; the Manager can stay open) before starting to change anything. Oh, and use a plain-text editor: BOINC doesn't like the output written by 'clever' XML tools.
ID: 64714
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 64715 - Posted: 27 Oct 2021, 11:30:22 UTC

That makes sense, but I disagree with the bit I've marked in red.


Oops. As always you are right Richard.

Shows that it has been a while since I needed to do this.
ID: 64715
Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 64716 - Posted: 27 Oct 2021, 12:49:17 UTC - in response to Message 64713.  

Out of idle curiosity, I looked at max_nbytes in client_state.xml and got:

$ grep max_nbytes /var/lib/boinc/client_state.xml | wc -l
430


430 entries! Thank goodness I do not seem to need to change one of them. How would I figure out which one needs changing? Four of them looked like this:

    <max_nbytes>0.000000</max_nbytes>
    <max_nbytes>0.000000</max_nbytes>
    <max_nbytes>104857600.000000</max_nbytes>
    <max_nbytes>104857600.000000</max_nbytes>

ID: 64716
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 64717 - Posted: 27 Oct 2021, 14:23:59 UTC - in response to Message 64716.  

It's the upload zip for your current task name.
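
To narrow 430 entries down to the few that matter, something like the following Python sketch might help. It assumes the three-line layout (name, nbytes, max_nbytes) that Alan K quotes from his own client_state.xml elsewhere in this thread; the function name limits_for_task is made up for illustration, and it treats the file as plain text rather than parsing it as XML.

```python
import re

# Assumption: each upload file appears as <name>, <nbytes>, <max_nbytes>
# on consecutive lines, as in the snippet quoted in this thread.
ENTRY = re.compile(
    r"<name>(?P<name>[^<]+)</name>\s*"
    r"<nbytes>[^<]*</nbytes>\s*"
    r"<max_nbytes>(?P<limit>[^<]+)</max_nbytes>"
)


def limits_for_task(state_text, task_name):
    """Return (file name, max_nbytes) pairs for files whose name starts with task_name."""
    return [
        (m.group("name"), float(m.group("limit")))
        for m in ENTRY.finditer(state_text)
        if m.group("name").startswith(task_name)
    ]
```

To use it, read /var/lib/boinc/client_state.xml (or wherever your data directory is) into a string and pass in the task name shown in the Tasks tab. This only reads the text, so it should be safe to run while the client is up.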
ID: 64717
Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 64718 - Posted: 27 Oct 2021, 15:44:46 UTC - in response to Message 64712.  

So, if there's "no congestion at the traffic lights", there's plenty of time for each file to get through before the next part shows up. :)


Seems that way to me too.

Wed 27 Oct 2021 04:49:25 AM EDT | climateprediction.net | Started upload of hadam4h_h1bc_200602_4_920_012116922_0_r1467636988_restart.zip
Wed 27 Oct 2021 04:49:28 AM EDT | climateprediction.net | Finished upload of hadam4h_h1bc_200602_4_920_012116922_0_r1467636988_restart.zip
Wed 27 Oct 2021 04:49:58 AM EDT | climateprediction.net | Sending scheduler request: To send trickle-up message.
Wed 27 Oct 2021 04:50:30 AM EDT | climateprediction.net | Started upload of hadam4h_h1bc_200602_4_920_012116922_0_r1467636988_4.zip
Wed 27 Oct 2021 04:50:50 AM EDT | climateprediction.net | Finished upload of hadam4h_h1bc_200602_4_920_012116922_0_r1467636988_4.zip
Wed 27 Oct 2021 05:11:02 AM EDT | climateprediction.net | Started upload of hadam4h_h1bc_200602_4_920_012116922_0_r1467636988_out.zip
Wed 27 Oct 2021 05:11:03 AM EDT | climateprediction.net | Computation for task hadam4h_h1bc_200602_4_920_012116922_0 finished
Wed 27 Oct 2021 05:11:06 AM EDT | climateprediction.net | Finished upload of hadam4h_h1bc_200602_4_920_012116922_0_r1467636988_out.zip

ID: 64718
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 64719 - Posted: 27 Oct 2021, 19:58:00 UTC

And having done a search and replace with all the relevant zips and allowed them double the allowance, I can confirm that my task that has just finished is uploading without issue.

Always good to get confirmation that I haven't made a mistake editing that file. :)
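
For anyone who would rather script the search and replace, here is a rough Python sketch of the same idea. It is not what I actually ran. The function names and the ".bak" backup suffix are invented for illustration, and it assumes the name / nbytes / max_nbytes layout quoted in this thread. It edits the file as plain text, and it should only be run with the BOINC client stopped, per Richard's post.

```python
import re
import shutil
from pathlib import Path

# Assumption: <name>, <nbytes>, <max_nbytes> appear in this order for each zip.
ZIP_ENTRY = re.compile(
    r"(?P<head><name>(?P<name>[^<]+\.zip)</name>\s*"
    r"<nbytes>[^<]*</nbytes>\s*"
    r"<max_nbytes>)(?P<limit>[^<]+)(?P<tail></max_nbytes>)"
)


def raise_limits(state_text, task_name, factor=2.0):
    """Multiply max_nbytes by factor for every zip whose name starts with task_name."""
    def bump(m):
        if not m.group("name").startswith(task_name):
            return m.group(0)  # leave other tasks' entries untouched
        new_limit = float(m.group("limit")) * factor
        return f"{m.group('head')}{new_limit:.6f}{m.group('tail')}"

    return ZIP_ENTRY.sub(bump, state_text)


def edit_state_file(path, task_name, factor=2.0):
    """Back up the file, then rewrite it. Run only while the client is stopped."""
    path = Path(path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))  # hypothetical backup name
    path.write_text(raise_limits(path.read_text(), task_name, factor))
```

Only the number between the max_nbytes tags changes, so the rest of the file stays byte-for-byte as the client wrote it.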
ID: 64719
Alan K
Joined: 22 Feb 06
Posts: 491
Credit: 31,239,565
RAC: 15,580
Message 64720 - Posted: 27 Oct 2021, 22:50:13 UTC - in response to Message 64703.  

It looks as if some of the batches are set to 150000000 anyway:

<name>hadam4h_h01m_200702_4_920_012115276_1_r113933754_4.zip</name>
<nbytes>0.000000</nbytes>
<max_nbytes>150000000.000000</max_nbytes>

and batch 895 is set to 210000000!

Does anything need to be changed?
ID: 64720
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 64721 - Posted: 28 Oct 2021, 4:58:22 UTC - in response to Message 64720.  

It looks as if some of the batches are set to 150000000 anyway:

<name>hadam4h_h01m_200702_4_920_012115276_1_r113933754_4.zip</name>
<nbytes>0.000000</nbytes>
<max_nbytes>150000000.000000</max_nbytes>

and batch 895 is set to 210000000!

Does anything need to be changed?


My one that failed was set to 150000000. The zip4 is produced slightly before the task finishes and, for most people, will upload before the task finishes and reaches the check that makes it fail. This will only cause a problem if upload speeds are very slow, or internet access for BOINC is turned off or down for another reason. The zip4 size reported in the Transfers tab is about 147 MB (binary), which puts it slightly above the 150000000-byte limit, but the file size check never occurs if the file is uploaded before the task ends.

So for most of us, most of the time, it isn't a problem.
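
The arithmetic behind "147 MB binary is above 150000000" can be checked with a few lines of Python. This is only a sketch: it assumes the Transfers tab figure is in binary megabytes (1,048,576 bytes each), and the helper names are made up.

```python
MIB = 1024 * 1024  # bytes in one binary megabyte


def mib_to_bytes(size_mib):
    """Convert a size in binary megabytes to bytes."""
    return size_mib * MIB


def exceeds_limit(size_mib, max_nbytes):
    """True if a file of size_mib binary megabytes is larger than max_nbytes bytes."""
    return mib_to_bytes(size_mib) > max_nbytes


# 147 binary MB is 154,140,672 bytes, which is over the 150,000,000 limit.
zip4_bytes = mib_to_bytes(147)
over = exceeds_limit(147, 150_000_000)
```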
ID: 64721
klepel
Joined: 9 Oct 04
Posts: 82
Credit: 69,959,172
RAC: 3,923
Message 64727 - Posted: 28 Oct 2021, 23:23:30 UTC - in response to Message 64721.  

Just had two with the file size error... at least I understand now, what you were talking about.
And yes, I have a slow internet connection.
ID: 64727
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 64728 - Posted: 29 Oct 2021, 1:15:50 UTC

OK, I've just sent an email.
I also mentioned the extended error messages.
ID: 64728
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 64729 - Posted: 29 Oct 2021, 13:17:50 UTC

For my batch 921, zip 1 was 150.06 MB.
That's sailing too close to the wind for safety.
I don't know yet what the project is going to do about these.
ID: 64729
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 64730 - Posted: 29 Oct 2021, 13:39:16 UTC - in response to Message 64729.  

I don't know yet what the project is going to do about these.


This is what I got on the task noticeboard page we have access to.

Thank you for this. I will update the limit for future batches. This shouldn’t be fundamentally different in size to the other CDDHDD batches


But, that was before anyone else posted that they had problems and before you noticed the size of your zip.
ID: 64730
Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 64731 - Posted: 29 Oct 2021, 15:04:24 UTC - in response to Message 64729.  

I do not watch to see the sizes of the uploaded files, but I do know the maximum speed my Internet link can go: about 75 megabits/second. Sometimes I see it slightly faster. So let us say it is 10 megabytes per second. Of course it can be slower if the Internet is congested. So here are the latest uploads.
Thu 28 Oct 2021 02:03:43 AM EDT | climateprediction.net | Started upload of hadam4h_h0c7_200602_4_920_012115657_0_r77250837_restart.zip
Thu 28 Oct 2021 02:03:46 AM EDT | climateprediction.net | Finished upload of hadam4h_h0c7_200602_4_920_012115657_0_r77250837_restart.zip
[ 3 seconds so 30 Megabytes estimated.]

Thu 28 Oct 2021 02:04:14 AM EDT | climateprediction.net | Sending scheduler request: To send trickle-up message.
Thu 28 Oct 2021 02:04:47 AM EDT | climateprediction.net | Started upload of hadam4h_h0c7_200602_4_920_012115657_0_r77250837_4.zip
Thu 28 Oct 2021 02:05:46 AM EDT | climateprediction.net | Finished upload of hadam4h_h0c7_200602_4_920_012115657_0_r77250837_4.zip
[59 seconds, so 590 Megabytes estimated.]

Thu 28 Oct 2021 02:25:48 AM EDT | climateprediction.net | Computation for task hadam4h_h0c7_200602_4_920_012115657_0 finished
Thu 28 Oct 2021 02:25:50 AM EDT | climateprediction.net | Started upload of hadam4h_h0c7_200602_4_920_012115657_0_r77250837_out.zip
Thu 28 Oct 2021 02:25:53 AM EDT | climateprediction.net | Finished upload of hadam4h_h0c7_200602_4_920_012115657_0_r77250837_out.zip
[3 seconds, so 30 Megabytes estimated.]


Now these speeds are the fastest I could be getting. If, instead, I look at the speeds CPDN says I am getting from this machine, they say
Average upload rate 1731.78 KB/sec
that is about 6 times slower than I assumed above, so those uploads are more likely to be 5 Megabytes, 100 Megabytes, and 5 Megabytes, respectively.
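
The revised estimates are just elapsed seconds times the reported average rate. A small Python sketch of that, assuming 1 MB = 1000 KB (the stats page does not say which convention it uses) and with a made-up function name:

```python
def estimated_megabytes(seconds, rate_kb_per_sec):
    """Rough upload size in MB, from elapsed time and average rate (1 MB = 1000 KB assumed)."""
    return seconds * rate_kb_per_sec / 1000.0


AVERAGE_RATE = 1731.78  # KB/sec, the figure CPDN reports for this machine

restart_mb = estimated_megabytes(3, AVERAGE_RATE)   # restart.zip, 3 seconds
zip4_mb = estimated_megabytes(59, AVERAGE_RATE)     # 4.zip, 59 seconds
out_mb = estimated_megabytes(3, AVERAGE_RATE)       # out.zip, 3 seconds
```

That gives roughly 5, 102 and 5 MB, in line with the rounded figures above. These are only estimates, since the average rate need not be the rate of any particular upload.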

Here are some speed tests of my machine to the Internet. The test server is about 60 miles from me.
Timestamp            Download     Upload       Test Server
10/29/2021 10:35:51  76.02 Mbps   89.09 Mbps   New York City, NY
10/25/2021 16:52:39  75.81 Mbps   60.33 Mbps   New York City, NY


Now for this work unit, CPDN reports
Peak working set size 	1,380.64 MB
Peak swap size 	        1,398.43 MB
Peak disk usage    	 12.91 MB


Since I have 65 GBytes of RAM, I have no trouble running several of these.
If my peak disk usage is only 13 Megabytes, then even if I uploaded all of it, this does not seem to be very much.
ID: 64731
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2187
Credit: 64,822,615
RAC: 5,275
Message 64732 - Posted: 29 Oct 2021, 15:32:39 UTC - in response to Message 64731.  


Now for this work unit, CPDN reports
Peak working set size 	1,380.64 MB
Peak swap size 	        1,398.43 MB
Peak disk usage    	 12.91 MB


Since I have 65 GBytes of RAM, I have no trouble running several of these.
If my peak disk usage is only 13 Megabytes, then even if I uploaded all of it, this does not seem to be very much.

The peak disk usage is the largest size of the slots directory that task is using. In most BOINC projects, this is where the file reading and writing is being done. However, CPDN uses the ..../projects/climateprediction.net/task name directory for its reading and writing.
ID: 64732
Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 64733 - Posted: 29 Oct 2021, 18:02:12 UTC - in response to Message 64732.  


The peak disk usage is the largest size of the slots directory that task is using. In most BOINC projects, this is where the file reading and writing is being done. However, CPDN uses the ..../projects/climateprediction.net/task name directory for its reading and writing.


OK, here are my disk usages in the relevant directories, where "." is /var/lib/boinc. Of course, these are just what is there now.
18232M  .
13825M  ./projects
11332M  ./projects/climateprediction.net
4404M   ./slots
3008M   ./projects/climateprediction.net/hadam4h_11cx_209902_4_921_012119079
2567M   ./projects/climateprediction.net/hadam4h_b0sx_201211_5_882_012036049
2262M   ./projects/climateprediction.net/hadam4h_h12y_200902_4_920_012116620
2151M   ./projects/climateprediction.net/hadam4h_10x3_209602_4_921_012118509
1905M   ./projects/climateprediction.net/hadam4h_b0sx_201211_5_882_012036049/datain
1694M   ./projects/climateprediction.net/hadam4h_11cx_209902_4_921_012119079/datain
1694M   ./projects/climateprediction.net/hadam4h_10x3_209602_4_921_012118509/datain
1691M   ./projects/climateprediction.net/hadam4h_h12y_200902_4_920_012116620/datain
1547M   ./projects/boinc.bakerlab.org_rosetta
1518M   ./projects/climateprediction.net/hadam4h_b0sx_201211_5_882_012036049/datain/ancil
1313M   ./projects/climateprediction.net/hadam4h_11cx_209902_4_921_012119079/dataout
1306M   ./projects/climateprediction.net/hadam4h_11cx_209902_4_921_012119079/datain/ancil
1306M   ./projects/climateprediction.net/hadam4h_10x3_209602_4_921_012118509/datain/ancil
1304M   ./projects/climateprediction.net/hadam4h_h12y_200902_4_920_012116620/datain/ancil

ID: 64733
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4541
Credit: 19,039,635
RAC: 18,944
Message 64734 - Posted: 29 Oct 2021, 19:38:46 UTC

they say
Average upload rate 1731.78 KB/sec


Mine peaks at about 90 KB/sec on bored band, and about twice that if I tether my phone and use 4G. Even then, most of the time, the upload happens quickly enough not to get checked and cause a failure. The problem comes if there is a lot of other internet activity at the same time, such as my other half and possibly myself having Zoom meetings, plus other concurrent uploads.

Sadly, though it has been asked for a number of times, BOINC does not allow selective pausing of uploads/downloads.
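
To see how close to the edge a slow link can be, here is a rough Python sketch that works out how long an upload takes and whether it fits in the gap between the zip being created and the task finishing. The function names are invented, 1 KB is taken as 1000 bytes, and the window is whatever you observe on your own machine. The 20 minutes in the example is just the gap visible in the logs posted earlier in this thread and will differ from host to host.

```python
def upload_minutes(size_bytes, rate_kb_per_sec):
    """Minutes needed to upload size_bytes at a steady rate (1 KB = 1000 bytes assumed)."""
    return size_bytes / (rate_kb_per_sec * 1000.0) / 60.0


def fits_in_window(size_bytes, rate_kb_per_sec, window_minutes):
    """True if the upload would finish before the window closes."""
    return upload_minutes(size_bytes, rate_kb_per_sec) <= window_minutes


# Example figures: a 150,000,000-byte zip and a 20-minute window (both illustrative).
at_90 = upload_minutes(150_000_000, 90)    # roughly 28 minutes
at_180 = upload_minutes(150_000_000, 180)  # roughly 14 minutes
```

With any other traffic sharing the link the effective rate drops further, which is when the size check is more likely to be reached before the upload completes.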
ID: 64734
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 64735 - Posted: 29 Oct 2021, 20:57:18 UTC - in response to Message 64733.  

What is on your HDs isn't important. It's the size of the zip, and once it's uploaded, there's no record of that.
So you need to Suspend net access, wait until a zip is created, and then write it down from the Transfers tab.
Then Resume net access.

I'm in contact with the researcher about this, but haven't heard back from my 3rd email.

Oh, and it's the weekend again. :(
ID: 64735
Jean-David Beyer
Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 64736 - Posted: 30 Oct 2021, 1:27:07 UTC - in response to Message 64735.  

What is on your HDs isn't important. It's the size of the zip, and once it's uploaded, there's no record of that.


I believe I am beginning to understand.
I am not running out of disk space for these upload files. It is the upload server that is unwilling to accept some of them (the too-large ones). But maybe my boinc-client is running a check to prevent my sending too-large .zip files to the upload server.

Do I have this right?
ID: 64736