Excessive checkpointing on new Linux hadcm3s tasks?

alanb1951
Joined: 31 Aug 04
Posts: 32
Credit: 9,526,696
RAC: 109,831
Message 61008 - Posted: 26 Sep 2019, 9:32:46 UTC

I recently landed a few of the new Linux tasks (batch 835). I don't usually turn on checkpoint debug in BOINC-Manager, but I had cause to need to do so on one of my machines and was surprised to see that these tasks were checkpointing about once a minute! I turned the logging on on my other machine that had some CPDN work and it was the same! (Turned logging off again!)

Now, on one machine I've got the checkpointing limit set to 600 seconds and on the other 240 seconds; it's obviously not respecting that!

As I said, I don't normally monitor this, so for all I know this could have been standard behaviour for as long as hadcm3s tasks have been available. Alternatively, it might only be doing this on my machines (though I suspect that's unlikely).

This is not exactly disc-I/O friendly if it's deliberate, so I wonder: has it always been like this, is it a side-effect of them trying to make Linux tasks more crash-proof, or is it a bug?

Any insight appreciated - Al.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4341
Credit: 16,497,933
RAC: 6,477
Message 61010 - Posted: 26 Sep 2019, 9:54:12 UTC - in response to Message 61008.  

Just checked on mine where it is about every five minutes but on a much slower machine. I think this is how it has always been for hadcm3s but don't have any evidence to back that up.

Jim1348
Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 61012 - Posted: 26 Sep 2019, 11:08:17 UTC - in response to Message 61008.  

This is not exactly disc-I/O friendly if it's deliberate, so I wonder: has it always been like this, is it a side-effect of them trying to make Linux tasks more crash-proof, or is it a bug?

I am glad you mentioned writes. I routinely check them in order to properly set up a write cache to protect my SSD.
On a Ryzen 2600, running hadcm3s work units on all 12 cores, the writes are about 350 GB/day. I set up a 4 GB write cache in Linux, with 30 minute latency (write-delay).

In case you are interested, for Ubuntu the commands are:
# Set write cache to 4 GB/4.5 GB, for 16 GB main memory
sudo sysctl vm.dirty_background_bytes=4000000000   # 268435456 default
sudo sysctl vm.dirty_bytes=4500000000              # 1073741824 x4 default
sudo sysctl vm.dirty_writeback_centisecs=500       # checks the cache every 5 seconds
sudo sysctl vm.dirty_expire_centisecs=180000       # page flush 30 min.; 3000 default

Whether this is really necessary to protect a typical SSD I don't know, but I prefer caution.
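
Values set with sysctl on the command line last only until the next reboot. A minimal sketch of making them persistent, assuming a distribution that reads /etc/sysctl.d/ (the file name 90-boinc-writecache.conf and the function name are illustrative, not from the post):

```shell
# Print a sysctl.d fragment holding the four values from the post.
# The function name is hypothetical; the values are Jim1348's, for a 16 GB machine.
write_cache_conf() {
  cat <<'EOF'
vm.dirty_background_bytes = 4000000000
vm.dirty_bytes = 4500000000
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 180000
EOF
}

# To install (assumed path; needs root):
#   write_cache_conf | sudo tee /etc/sysctl.d/90-boinc-writecache.conf
#   sudo sysctl --system
```

Note the trade-off: data held in the cache for up to 30 minutes is lost if the machine loses power, so this buys fewer disk writes at the cost of crash safety.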

Jim1348
Message 61026 - Posted: 26 Sep 2019, 22:54:33 UTC - in response to Message 61012.  

And I just checked the writes on my Ryzen 3700x, which is running hadcm3s on all 16 of the cores. Since the writes are rather variable, I used iostat 7200, which measures over a two-hour period. The writes were 500 GB/day, too much for me without a write-cache for protection. In fact, my limit is 70 GB/day.

Dave Jackson
Volunteer moderator
Message 61027 - Posted: 27 Sep 2019, 6:04:51 UTC - in response to Message 61026.  

This has been raised with the project. Checkpoints are written every model day. When this model type was introduced, computers were a lot slower and solid state disks either didn't exist or cost an arm and a leg for even a 40GB one. So the problem has sort of crept up on us. I don't know how quick a fix it would be to change checkpoints to every ten days, or even monthly, giving 12 checkpoints per zip file.

alanb1951
Message 61029 - Posted: 27 Sep 2019, 7:04:13 UTC - in response to Message 61027.  

This has been raised with the project. Checkpoints are written every model day. When this model type was introduced, computers were a lot slower and solid state disks either didn't exist or cost an arm and a leg for even a 40GB one. So the problem has sort of crept up on us. I don't know how quick a fix it would be to change checkpoints to every ten days, or even monthly, giving 12 checkpoints per zip file.

Dave,

Thanks for this!

I hope they can (and do) change this, because I will not be running CPDN on my next system (Ryzen 3700X, I hope) if hadcm3s work units are going to hammer the discs like that. (Thanks for the numbers and cache tuning stuff, Jim1348!)

I presume we aren't ever going to get the facility to deselect certain applications back; if that's the case they ought to try to make details like checkpoint frequency as consistent as they can across all applications available on a given platform (at least as far as the most frequent checkpointing is concerned).

I can understand an application ignoring the user's checkpoint guidelines by not checkpointing as often as the user allows, but checkpointing more often ought to be a no-no, as this situation demonstrates! (It was theoretically possible to determine that limit and, if the limit was reasonable, to enable some code in the checkpoint logic saying "how long since the last one? If longer than the limit, do another..." -- I don't think that has changed.)
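
The "how long since the last one?" check can be sketched as follows. This only illustrates the logic in shell; it is not CPDN's or BOINC's actual code, and the function name, argument order, and example timings (a model day finishing every 60 s, a user limit of 600 s) are assumptions.

```shell
# Succeed (exit 0) if at least $3 seconds have passed between the last
# checkpoint ($2) and now ($1), i.e. writing another one respects the limit.
checkpoint_due() {
  [ $(( $1 - $2 )) -ge "$3" ]
}

# Simulate one hour: a checkpoint opportunity every 60 s, user limit 600 s.
last=0
written=0
t=60
while [ "$t" -le 3600 ]; do
  if checkpoint_due "$t" "$last" 600; then
    last=$t
    written=$(( written + 1 ))
  fi
  t=$(( t + 60 ))
done
echo "checkpoints written in one hour: $written"
```

With the check in place the simulated hour produces 6 checkpoints instead of 60.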

By the way, do we know what the checkpoint behaviour of HadAM4 and OpenIFS is/will be???

Cheers - Al.

Dave Jackson
Volunteer moderator
Message 61030 - Posted: 27 Sep 2019, 8:24:00 UTC - in response to Message 61029.  

George's email has been responded to and Sarah is going to look at reducing the checkpointing for future hadcm3s work, though probably too late for the rest of the current batch. From memory, on my slow machine HadAM4 checkpoints just over every 20 minutes, so about every 4 minutes on a fast machine. I can't remember from testing what the IFS tasks do.

Jean-David Beyer
Joined: 5 Aug 04
Posts: 1055
Credit: 16,516,801
RAC: 955
Message 61036 - Posted: 27 Sep 2019, 12:37:09 UTC - in response to Message 61008.  

I don't usually turn on checkpoint debug in BOINC-Manager,


How does one do that?

Dave Jackson
Volunteer moderator
Message 61037 - Posted: 27 Sep 2019, 13:54:34 UTC

I don't usually turn on checkpoint debug in BOINC-Manager,



How does one do that?


Options > Event log options. It is one of 26 boxes you can check, some of which I actually understand!
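
Where the Manager does not offer that dialog, the same log flag can usually be set in cc_config.xml in the BOINC data directory. A sketch, assuming the client's checkpoint_debug log flag and a data directory of /var/lib/boinc-client (the path varies by distribution; the function name is illustrative):

```shell
# Print a minimal cc_config.xml that enables checkpoint logging.
# If a cc_config.xml with other options already exists, merge by hand instead.
checkpoint_debug_config() {
  cat <<'EOF'
<cc_config>
  <log_flags>
    <checkpoint_debug>1</checkpoint_debug>
  </log_flags>
</cc_config>
EOF
}

# To apply (assumed path; needs write access to the data directory):
#   checkpoint_debug_config | sudo tee /var/lib/boinc-client/cc_config.xml
#   boinccmd --read_cc_config
```

Setting the value back to 0 and re-reading the file turns the logging off again.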

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,472,181
RAC: 3,628
Message 61044 - Posted: 27 Sep 2019, 18:15:45 UTC - in response to Message 61026.  
Last modified: 27 Sep 2019, 18:16:04 UTC

And I just checked the writes on my Ryzen 3700x, which is running hadcm3s on all 16 of the cores. Since the writes are rather variable, I used iostat 7200, which measures over a two-hour period. The writes were 500 GB/day, too much for me without a write-cache for protection. In fact, my limit is 70 GB/day.

So with a command line of iostat -m 7200, does the following mean this PC is writing 40 GB every 2 hours?

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.50   61.89    2.98    3.32    0.00   31.30

Device             tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
loop0             0.00         0.00         0.00          0          0
loop1             0.00         0.00         0.00          0          0
loop2             0.00         0.00         0.00          0          0
loop3             0.00         0.00         0.00          0          0
loop4             0.00         0.00         0.00          0          0
loop5             0.00         0.00         0.00          0          0
loop6             0.00         0.00         0.00          0          0
sda              94.19         0.00         5.58          0      40171

Jim1348
Message 61046 - Posted: 28 Sep 2019, 2:14:28 UTC - in response to Message 61044.  

So with a command line of iostat -m 7200, does the following mean this PC is writing 40 GB every 2 hours?

Yes.
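
The conversion behind these figures is simple arithmetic: megabytes written in the interval, divided by the interval length, scaled to a day. A small sketch (the function name is made up, and 1 GB is taken as 1000 MB to match the rough figures in this thread):

```shell
# Convert "MB written over N seconds" into GB per day.
gb_per_day() {
  awk -v mb="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", mb / secs * 86400 / 1000 }'
}

gb_per_day 40171 7200    # the sda MB_wrtn figure above, over 2 hours
```

For 40171 MB in 7200 s this gives about 482 GB/day.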

bernard_ivo
Joined: 18 Jul 13
Posts: 438
Credit: 24,510,982
RAC: 1,493
Message 61047 - Posted: 28 Sep 2019, 11:25:29 UTC
Last modified: 28 Sep 2019, 11:27:07 UTC

In my case it seems to be 705 GB in 2 hours (4 WUs only). However, if I use -m 3600 I get almost the same result (not half, as anticipated).
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.55   49.62    0.37    0.78    0.00   48.67

Device             tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
loop0             0.00         0.00         0.00          2          0
loop1             0.00         0.00         0.00          1          0
loop2             0.00         0.00         0.00          0          0
loop3             0.00         0.00         0.00          0          0
loop4             0.00         0.00         0.00          2          0
loop5             0.00         0.00         0.00          0          0
loop6             0.10         0.00         0.00         43          0
loop7             0.00         0.00         0.00          0          0
sda              31.17         0.03         1.68      13406     705603
sdb              71.77         1.15         0.00     483446       1304

Jim1348
Message 61048 - Posted: 28 Sep 2019, 14:45:30 UTC - in response to Message 61047.  

I just did a 12-hour test on my Ryzen 3700X, and the result is 247 GB, or 494 GB/day, which is consistent with my earlier 2-hour test. The output is below.

jim@Ryzen3700X:~$ iostat -m 43200
Linux 5.0.0-29-generic (Ryzen3700X) 	09/27/2019 	_x86_64_	(16 CPU)

Device             tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
loop0             0.00         0.00         0.00          0          0
loop1             0.00         0.00         0.00          0          0
loop2             0.00         0.00         0.00          0          0
loop3             0.00         0.00         0.00          0          0
loop4             0.00         0.00         0.00          0          0
loop5             0.00         0.00         0.00          0          0
loop6             0.00         0.00         0.00          0          0
loop7             0.00         0.00         0.00          0          0
sda             102.32         0.00         5.72         21     247023
loop8             0.00         0.00         0.00          0          0
loop9             0.00         0.00         0.00          0          0
loop10            0.00         0.00         0.00          0          0
loop11            0.00         0.00         0.00          0          0
loop12            0.00         0.00         0.00          0          0
loop13            0.00         0.00         0.00          0          0
loop14            0.00         0.00         0.00          0          0


So, bernard_ivo, I don't know why your 1-hour test did not give the expected results.
I assume the same number of cores were operating (and not on hold), but if there were any downloads occurring, that would add to the writes also.
Try it again; I think it will work as expected eventually.

bernard_ivo
Message 61050 - Posted: 28 Sep 2019, 15:53:36 UTC - in response to Message 61048.  

Thanks Jim. Any downloads are on the sdb drive. I have just restarted the machine, to test. Am I using iostat correctly though? I suppose that to measure the last hour I need to let the machine run for at least an hour and then run the command with -m 3600. I read the man page but am still not sure whether I understand it correctly. And why, after executing the command, do I need to stop it to get back to the command line?

Jean-David Beyer
Message 61051 - Posted: 28 Sep 2019, 15:59:54 UTC - in response to Message 61037.  

Does not exist for me. Running 7.2.33 which is the latest one supported for my Linux distro.

Jean-David Beyer
Message 61052 - Posted: 28 Sep 2019, 16:10:44 UTC - in response to Message 61044.  

On my machine, /dev/sdd is the one with the boinc partition on it. There is also a partition there with some videos, which I watch sometimes but rarely write.
/dev/sde is a removable hard drive for backups that is currently plugged in and mounted.

I have a 4-core processor that has been running two CPDN tasks for about 3 days, plus one Rosetta and one WCG task.

$ iostat -m 7200
Linux 2.6.32-754.23.1.el6.x86_64 (DellT7600.localdomain) 	09/28/2019 	_x86_64_	(4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.27   93.56    1.02    0.05    0.00    0.09

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdd               9.10         0.01         0.38       5052     154425
sdb               9.69         0.33         0.05     133111      19124
sda               0.00         0.00         0.00          2          0
sdc               0.02         0.00         0.00        538         79
sde               3.09         0.00         0.31        205     126202


Dave Jackson
Volunteer moderator
Message 61053 - Posted: 28 Sep 2019, 17:19:53 UTC - in response to Message 61051.  

Does not exist for me. Running 7.2.33 which is the latest one supported for my Linux distro.


The actual package is sysstat which includes the iostat command.

Jim1348
Message 61054 - Posted: 28 Sep 2019, 17:24:36 UTC - in response to Message 61050.  
Last modified: 28 Sep 2019, 17:27:49 UTC

Am I using iostat correctly though? I suppose that to measure the last hour I need to let the machine run for at least an hour and then run the command with -m 3600. I read the man page but am still not sure whether I understand it correctly. And why, after executing the command, do I need to stop it to get back to the command line?

It looks good to me. If you want another command line, you can just open another window and let iostat keep running if you want to.
And if you have read the manual, you know more about it than I do. I have just used it a lot, but am not an expert.

PS - Yes, I neglected to mention that you have to install sysstat first to use iostat.
sudo apt install sysstat

bernard_ivo
Message 61055 - Posted: 28 Sep 2019, 17:47:09 UTC - in response to Message 61054.  
Last modified: 28 Sep 2019, 17:47:32 UTC


It looks good to me. If you want another command line, you can just open another window and let iostat keep running if you want to.
And if you have read the manual, you know more about it than I do. I have just used it a lot, but am not an expert.


I just wondered why it does not exit after executing, unless there is a reason to keep it running. It seems that after the given interval (e.g. 10 for 10 s) it generates a report for that interval. So after letting it run with 3600 (1 h) I will get a second report with the data read/written within that hour. The first report (on executing the command) seems to be all data since boot, hence my over 700 GB was not for 2h but since last boot.
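
Given that behaviour, one way to get a single timed report and have iostat exit by itself is to pass a count after the interval. A sketch, assuming a sysstat version recent enough to support the -y option (older releases may lack it); the helper function name is illustrative:

```shell
# One report covering the next hour, omitting the since-boot report, then exit:
#   iostat -m -y 3600 1
# Without -y, ask for two reports and ignore the first:
#   iostat -m 3600 2

# Pull MB_wrtn for one device out of the last report in iostat -m output.
# The column is located by its header name, since layouts differ by version.
last_mb_written() {
  awk -v dev="$1" '
    $1 ~ /^Device/ { col = 0; for (i = 1; i <= NF; i++) if ($i == "MB_wrtn") col = i }
    $1 == dev && col { val = $col }
    END { if (val != "") print val }
  '
}

# Usage: iostat -m -y 3600 1 | last_mb_written sda
```

Because the function keeps only the last matching line, it also works on the two-report form, where the first report is the since-boot one.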

Jim1348
Message 61056 - Posted: 28 Sep 2019, 18:16:35 UTC - in response to Message 61055.  

The first report (on executing the command) seems to be all data since boot, hence my over 700 GB was not for 2h but since last boot.

That is my interpretation too. I ignore the first value, since I want the timed value. Thanks for pointing this out.
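
A quick arithmetic check supports this reading: in a since-boot report, the total divided by the average rate should come out near the uptime rather than the requested interval. A sketch using bernard_ivo's sda line (the function name is made up, and the rate in the output is rounded to two decimals, so the result is approximate):

```shell
# Hours of activity implied by a total (MB) and an average rate (MB/s).
implied_hours() {
  awk -v total="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", total / rate / 3600 }'
}

implied_hours 705603 1.68   # about 116.7 hours, i.e. nearly 5 days, not 2 hours
```

At 1.68 MB/s the actual write rate on that machine is roughly 145 GB/day.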