Thought on improving model robustness

Questions and Answers : Unix/Linux : Thought on improving model robustness
Steve Bergman

Joined: 5 Aug 08
Posts: 22
Credit: 501,217
RAC: 0
Message 34646 - Posted: 15 Aug 2008, 17:48:35 UTC

I'm fairly new to cpdn, so keep that in mind. But one thing that I have done to try to ensure robustness is to use the extended 'j' attribute, supported by the ext3 filesystem, on my BOINC subtree.

For anyone who is not aware, ext3 has three journaling modes, each giving a different set of data-integrity guarantees.

data=writeback:

This is what most journaling filesystems implement, and is the fastest mode. It journals the metadata and obviates the need for an fsck after an unplanned shutdown. The filesystem structure is guaranteed to be consistent. But some data blocks could contain garbage.


data=ordered:

This is the ext3 default. It entails some performance penalty relative to "writeback". Like "writeback", it journals only the metadata, but it orders writes carefully to guarantee that no data blocks can contain garbage. Files may not have the very *latest* data, but they will reflect what the file looked like some seconds before the crash. It is still possible for two related files to be inconsistent with each other if one has the latest data and the other does not.


data=journal:

This is the most robust mode. It incurs the greatest performance penalty. It journals both the metadata and the data. It does not tell the requesting program that the data has been written until it has actually been written to the journal. This mode guarantees that if the calling program was told that the data was written to disk, it absolutely will be there after a crash.
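For reference, the journaling mode is selected at mount time with the data= option. A hypothetical /etc/fstab entry requesting full data journaling might look like this (the device and mount point are placeholders):

```
/dev/sda3  /home  ext3  defaults,data=journal  1 2
```

Note that on kernels of this era the data mode generally cannot be changed by a simple remount, so an umount/mount (or reboot) may be needed for it to take effect.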

Since full data journaling incurs a substantial performance penalty, most people do not find it worthwhile filesystem-wide. However, it is possible to tell ext3 to journal the data of individual files. This is done with the 'j' attribute.

For example, if one has a file "important.db", one can do:

chattr +j important.db

and from then on, that file will have its data journaled, regardless of what mode the filesystem is mounted in. The -R option to chattr makes it recursive:

cd ~steve
chattr -R +j BOINC

will ensure that everything related to our (very long running) BOINC processes has full data integrity guarantees. (At least at the filesystem level.)

And, of course, you can view the attributes with 'lsattr', which works very much like ls.

I just wanted to mention this and get any feedback anyone might care to give.
Steve Bergman

Joined: 5 Aug 08
Posts: 22
Credit: 501,217
RAC: 0
Message 34711 - Posted: 20 Aug 2008, 20:24:19 UTC - in response to Message 34646.  
Last modified: 20 Aug 2008, 20:42:39 UTC

After thinking about this some more, an obvious question occurred to me. Does the model call fsync to force writing the file's updated blocks to disk after writes? This would accomplish the same thing programmatically, reducing the possibility of corruption after an unplanned shutdown.

Edit:

To answer my own question, I monitored my CM3 model with 'strace', which shows me all of the system calls that the model makes to the operating system kernel. It appears that it does *not* use fsync after writes. This would, I believe, be a way to improve the success rate of future models.
Profile astroWX
Volunteer moderator

Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 34715 - Posted: 21 Aug 2008, 22:26:08 UTC

CPDN used to hammer HDD. Carl put a lot of effort into reducing HDD I/O and got rid of ~91%, at a small cost in RAM requirement. I don't think Run stability suffered. It's a matter of trade-offs, eh?
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
Steve Bergman

Send message
Joined: 5 Aug 08
Posts: 22
Credit: 501,217
RAC: 0
Message 34716 - Posted: 22 Aug 2008, 0:21:45 UTC - in response to Message 34715.  
Last modified: 22 Aug 2008, 1:21:33 UTC

CPDN used to hammer HDD. Carl put a lot of effort into reducing HDD I/O and got rid of ~91%, at a small cost in RAM requirement. I don't think Run stability suffered. It's a matter of trade-offs, eh?


I don't really see a substantial trade off in this case. Calling fsync might increase write overhead by, say, 10%. That's off the top of my head and subject to debate, but I think it's a reasonable ballpark figure. It all goes through the OS's page cache and the OS's (probably elevator) I/O scheduling algorithms.

On the other hand, it is not a panacea. What fsync gives the app is the ability to know what has been written to disk and what may not have been. It is still up to the application to "do the right thing". If fsync returns success, the app knows the data is safely on disk; if it returns an error, the app knows it must take action to prevent a crashed model. In other words, fsync gives the app an absolute guarantee that certain things have, effectively, happened. Without that, the app has to assume that the write actually happened, and it might not have happened at all.

It is not exactly an esoteric or performance-killing facility. Databases like PostgreSQL and MySQL use it to help ensure data integrity. Windows, no doubt, has a counterpart system call.

It is a necessary, but not sufficient, condition to ensure data integrity.

I am not intimately familiar with the I/O performance issues that CPDN may have had with certain models. But I sincerely do not see exercising a bit of control over the timing of physical writes, in this context, to be any sort of return to those difficulties about which I have heard.

It sounds like disk writes are, perhaps, already being collected up into logically related bundles, for performance reasons. That would be a perfect fit for fsync. Because the real danger a programmer faces is when only a *partial* write of a logically related bundle of data occurs. One wants to know if it happened, or if it might not have. Or, most importantly, if only *part* of it might have happened.

I'm not an expert in this area. (And it has been 26 years since I have written in Fortran!) But I do know my way around a bit. I, personally, think there is some promise to improve the success rate of these very long-running models, here.
Profile Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 34721 - Posted: 22 Aug 2008, 10:01:00 UTC - in response to Message 34716.  

CPDN used to hammer HDD. Carl put a lot of effort into reducing HDD I/O and got rid of ~91%, at a small cost in RAM requirement. I don't think Run stability suffered. It's a matter of trade-offs, eh?

... I am not intimately familiar with the I/O performance issues that CPDN may have had with certain models. ...
Hi Steve: If memory serves, the changes were made to reduce hard disk activity rather than to improve performance - specifically to reduce wear and heat on laptops. (The project is keen not to trash or incinerate PCs!) The 'optimised' version is actually slower than the previous version ...

... and that's not to say that fsync (or ye olde FORTRAN equivalent) isn't a good idea. The model checkpoints and some versions have other restart dumps as well. I agree with you that if the application could be confident that the checkpointed data is reliable then it ought to be able to restart more often than it does. Models will still fail when the saved data is reliable, but junk - because the model has been corrupted in some other way.


©2024 climateprediction.net