Re: [arch-general] NFS close-to-open
On Wed, Dec 12, 2012 at 9:34 AM, Paul Gideon Dann <pdgiddie@gmail.com> wrote:
On Wednesday 12 Dec 2012 08:58:52 you wrote:
On 12/12/2012 05:35 AM, Paul Gideon Dann wrote:
On Tuesday 11 Dec 2012 12:08:46 you wrote:
After more poking-about: filp_close() in the VFS calls nfs_file_flush(), which writes the file back to the server. The struct file we're given is freed up shortly after calling flush, so this looks like the last chance we get to write changes (and I don't think we should change that).
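(For anyone following along in the source, here is a condensed sketch of the path being described, paraphrased from fs/open.c and fs/nfs/file.c of roughly that era; this is not verbatim kernel code.)

/* Condensed sketch of the close path described above, paraphrased
 * from fs/open.c and fs/nfs/file.c (circa Linux 3.x); not verbatim. */

int filp_close(struct file *filp, fl_owner_t id)
{
	int retval = 0;

	/* For NFS, f_op->flush points at nfs_file_flush(). */
	if (filp->f_op && filp->f_op->flush)
		retval = filp->f_op->flush(filp, id);

	fput(filp);	/* drops the reference; the struct file is freed soon after */
	return retval;
}

static int nfs_file_flush(struct file *file, fl_owner_t id)
{
	/* Nothing to write back if the file was never open for writing. */
	if ((file->f_mode & FMODE_WRITE) == 0)
		return 0;

	/* Last chance to push dirty pages to the server before the
	 * struct file goes away. */
	return vfs_fsync(file, 0);
}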
Hmm, so is this maybe an implementation limitation? I would expect "nocto" to prevent that flushing, since that seems to be the main purpose of "nocto". But maybe it would have been too much work to buffer the flushes, and so every file gets flushed on close anyway?
Paul
Something like that, yeah. "nocto" just applies to the attribute cache, and does not change the flushing behavior.
That's a pity. In that case, it seems the man page is a little misleading, as it at least implies (in the "Data and Metadata Coherence" section) that "nocto" will disable the flushing. Also, at least some other implementations definitely do disable the flushing[1][2].
Do you know if this is a feature that can be requested? Since writes are buffered already (unless mounted "sync"), I can't imagine it could be too difficult to suppress the flush and let the file be written when the next buffer is sent? But then I guess it would have been implemented already if it were that easy.
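(Illustration only: the change Paul is asking about would presumably amount to a guard in nfs_file_flush(), something like the hypothetical sketch below, keyed off the existing NFS_MOUNT_NOCTO mount flag. Bryan's reply explains why the developers don't want to defer the flush.)

/* HYPOTHETICAL sketch of the behaviour being requested; this guard
 * does not exist in the Linux client. NFS_MOUNT_NOCTO is the real
 * mount flag, but skipping the flush here is the invented part. */
static int nfs_file_flush(struct file *file, fl_owner_t id)
{
	struct inode *inode = file->f_path.dentry->d_inode;

	if ((file->f_mode & FMODE_WRITE) == 0)
		return 0;

	/* Under "nocto", leave the dirty pages for ordinary background
	 * writeback instead of flushing synchronously at close. */
	if (NFS_SERVER(inode)->flags & NFS_MOUNT_NOCTO)
		return 0;

	return vfs_fsync(file, 0);
}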
I double checked with Trond and he agrees that we shouldn't defer the flush because that would cause us to hold the file open for longer than we really should (and it would make NFS sillyrenames more difficult, too).

- Bryan
Paul
[1]: http://docs.oracle.com/cd/E19082-01/819-2240/mount-nfs-1m/index.html
[2]: http://www.solarisinternals.com/wiki/index.php/File_Systems
On Thursday 13 Dec 2012 15:53:32 Bryan wrote:
I double checked with Trond and he agrees that we shouldn't defer the flush because that would cause us to hold the file open for longer than we really should (and it would make NFS sillyrenames more difficult, too).
I thought that's why the "nocto" documentation (and general guidance) says it's not safe to use this mount option for directories that are shared between several clients?

My particular use case is a remote-mounted /var, which is used only by a single client. I'd like to avoid the flushing bottleneck, but would also like to avoid the danger of the client and server being out of sync if the server falls over and reboots. Am I misunderstanding the stated purpose of "nocto" (preventing close-to-open)? It seemed to fit this scenario perfectly :(

Paul

P.S.: Sorry, I've realised that I've been accidentally replying directly to Bryan instead of to the mailing list.
On 12/14/2012 04:50 AM, Paul Gideon Dann wrote:
On Thursday 13 Dec 2012 15:53:32 Bryan wrote:
I double checked with Trond and he agrees that we shouldn't defer the flush because that would cause us to hold the file open for longer than we really should (and it would make NFS sillyrenames more difficult, too).
I thought that's why the "nocto" documentation (and general guidance) says it's not safe to use this mount option for directories that are shared between several clients?
Right.
My particular use case is a remote-mounted /var, which is used only by a single client. I'd like to avoid the flushing bottleneck, but would also like to avoid the danger of the client and server being out of sync if the server falls over and reboots. Am I misunderstanding the stated purpose of "nocto" (preventing close-to-open)? It seemed to fit this scenario perfectly :(
I'm sorry, but I think you are misunderstanding the mount option. "nocto" is used to cut down on getattrs when deciding if a file has changed, and has nothing to do with when writes are sent to the server. The "async" mount option (enabled by default) already delays writes to the server under normal use cases, but we still flush data when files are closed; see "The sync mount option" section in nfs(5).

What are your other export and mount options for this mount? Maybe you have something else that can be tuned...

- Bryan
Paul
P.S.: Sorry, I've realised that I've been accidentally replying directly to Bryan instead of to the mailing list.
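(To make the distinction concrete, here is a minimal sketch of a single-client mount like Paul's, via the mount(2) syscall; the server name, export path, and address are hypothetical. Per Bryan's explanation, "nocto" here only relaxes attribute revalidation on open; it does not suppress the flush-on-close.)

/* Minimal sketch of a single-client NFS mount like the /var setup
 * discussed above. The server name, export path, and address are
 * hypothetical. In practice you would use mount(8) or an fstab line
 * such as:
 *   server:/export/var  /var  nfs  nocto,vers=3  0 0
 * mount(8) also fills in details like the addr= option for you. */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	if (mount("server:/export/var", "/var", "nfs", 0,
	          "nocto,vers=3,addr=192.0.2.1") != 0) {
		perror("mount");
		return 1;
	}
	return 0;
}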
On Friday 14 Dec 2012 11:09:59 you wrote:
I'm sorry, but I think you are misunderstanding the mount option. "nocto" is used to cut down on getattrs when deciding if a file has changed, and has nothing to do with when writes are sent to the server.
That seems to be the case for the Linux NFS implementation, but not universally. My question really is whether it might be sensible to request that "nocto" be extended, so that flushing doesn't occur on close. Since other non-Linux implementations[1][2] specifically do disable flushing when "nocto" is enabled, I'm wondering why Linux doesn't. Since "close-to-open" includes the flush-on-close behaviour, it seems intuitive to me that "nocto" should mean "no flush-on-close" too. With "cto" disabled, why does the file need to be flushed? There is no requirement to fulfil.

[1] http://docs.oracle.com/cd/E19082-01/819-2240/mount-nfs-1m/index.html
[2] http://www.solarisinternals.com/wiki/index.php/File_Systems
The "async" mount option (enabled by default) already delays writes to the server under normal usecases, but we still flush data when files are closed see "The sync mount option" section in nfs(5).
Since nfs-utils 1.0.0, "sync" has been the default export option, because "async" is not really a safe option. "async" works fine, and improves performance a *lot*, but it does mean that the client thinks the data is committed to disk when in fact it isn't, which is potentially dangerous. If "nocto" disabled the whole close-to-open behaviour (including the flush-on-close), it would be possible for the client to recover from a server failure, because it would know exactly what hadn't yet been flushed, and the server wouldn't be lying to it about when data is actually safely on disk.
What are your other export and mount options for this mount? Maybe you have something else that can be tuned...
I'm using "async" now, and performance is fine. This is now more of a theoretical correctness question, to be honest. Really, I'd like to understand why the kernel developers chose not to disable flush-on-close when "nocto" is enabled. I suspect it was just to simplify the code, but I feel that disabling flush-on-close would be a more correct solution than "async" for single-client exports. Paul
On 12/17/2012 06:12 AM, Paul Gideon Dann wrote:
On Friday 14 Dec 2012 11:09:59 you wrote:
I'm sorry, but I think you are misunderstanding the mount option. "nocto" is used to cut down on getattrs when deciding if a file has changed, and has nothing to do with when writes are sent to the server.
That seems to be the case for the Linux NFS implementation, but not universally. My question really is whether it might be sensible to request that "nocto" be extended, so that flushing doesn't occur on close. Since other non-Linux implementations[1][2] specifically do disable flushing when "nocto" is enabled, I'm wondering why Linux doesn't. Since "close-to-open" includes the flush-on-close behaviour, it seems intuitive to me that "nocto" should mean "no flush-on-close" too. With "cto" disabled, why does the file need to be flushed? There is no requirement to fulfil.
[1] http://docs.oracle.com/cd/E19082-01/819-2240/mount-nfs-1m/index.html
[2] http://www.solarisinternals.com/wiki/index.php/File_Systems
The "async" mount option (enabled by default) already delays writes to the server under normal usecases, but we still flush data when files are closed see "The sync mount option" section in nfs(5).
Since nfs-utils 1.0.0, "sync" has been the default export option, because "async" is not really a safe option. "async" works fine, and improves performance a *lot*, but it does mean that the client thinks the data is committed to disk when in fact it isn't, which is potentially dangerous. If "nocto" disabled the whole close-to-open behaviour (including the flush-on-close), it would be possible for the client to recover from a server failure, because it would know exactly what hadn't yet been flushed, and the server wouldn't be lying to it about when data is actually safely on disk.
What are your other export and mount options for this mount? Maybe you have something else that can be tuned...
I'm using "async" now, and performance is fine. This is now more of a theoretical correctness question, to be honest. Really, I'd like to understand why the kernel developers chose not to disable flush-on-close when "nocto" is enabled. I suspect it was just to simplify the code, but I feel that disabling flush-on-close would be a more correct solution than "async" for single-client exports.
This code was written before I joined up, so at this point I can only make guesses about why it was implemented this way. I think I'm the only NFS developer using Arch, so you might get better information asking on linux-nfs@vger.kernel.org.

- Bryan
Paul
On Monday 17 Dec 2012 14:35:45 you wrote:
This code was written before I joined up, so at this point I can only make guesses about why it was implemented this way. I think I'm the only NFS developer using Arch, so you might get better information asking on linux-nfs@vger.kernel.org.
Thanks Bryan. I think I'll wait until after Christmas, but that's a good idea.

Thanks for your help,
Paul