[arch-general] NFS close-to-open
I've been learning about the close-to-open policy of NFS, which causes each file to be flushed to the server as it's closed, to ensure consistency across clients. This is a big performance hit when trying to upgrade the system, because of the numerous writes to small files.

I know about the "async" export option, obviously, but there is also a "nocto" client mount option, which is supposed to disable the close-to-open mechanism for that client. As far as I can tell, this is supposed to stop the client from flushing the file when it closes it (as well as not bothering to check cache consistency when it opens). However, this seems to have no effect: the client is still flushing the file to the server on close, causing massive wait-io.

Does anyone have any idea why "nocto" doesn't have the effect I was hoping it would? The "async" option works as expected, but it's more important to me that the client cache is correct, and it just bugs me.

Thanks for any insight,
Paul
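For concreteness, the options under discussion would be applied roughly like this. The server name, export path, and mount point below are invented, and `actimeo` is only an example of a related attribute-cache knob, not something from the thread:

```shell
# Client side: "nocto" is a mount option (shown here with NFSv3).
# Server/export path and mount point are made up for illustration.
mount -t nfs -o vers=3,nocto,actimeo=600 fileserver:/srv/export /mnt/nfs

# Equivalent /etc/fstab line:
# fileserver:/srv/export  /mnt/nfs  nfs  vers=3,nocto,actimeo=600  0  0

# Server side: "async" is an export option, set in /etc/exports, e.g.:
# /srv/export  *(rw,async)
```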
On 12/10/2012 08:11 AM, Paul Gideon Dann wrote:
> I've been learning about the close-to-open policy of NFS, which causes each file to be flushed to the server as it's closed, to ensure consistency across clients. This is a big performance hit when trying to upgrade the system, because of the numerous writes to small files.
>
> I know about the "async" export option, obviously, but there is also a "nocto" client mount option, which is supposed to disable the close-to-open mechanism for that client. As far as I can tell, this is supposed to stop the client from flushing the file when it closes it (as well as not bothering to check cache consistency when it opens). However, this seems to have no effect: the client is still flushing the file to the server on close, causing massive wait-io.
>
> Does anyone have any idea why "nocto" doesn't have the effect I was hoping it would? The "async" option works as expected, but it's more important to me that the client cache is correct, and it just bugs me.
>
> Thanks for any insight,
> Paul

What NFS version are you using? I just took a quick glance through the code and, from what I can tell, v2 and v3 check for the "nocto" flag but v4 doesn't. I'm not sure if this is an oversight or by design, but I can ask later today.

- Bryan
On Monday 10 Dec 2012 09:26:51 Bryan Schumaker wrote:
> On 12/10/2012 08:11 AM, Paul Gideon Dann wrote:
>> Does anyone have any idea why "nocto" doesn't have the effect I was hoping it would? The "async" option works as expected, but it's more important to me that the client cache is correct, and it just bugs me.
>
> What NFS version are you using? I just took a quick glance through the code and, from what I can tell, v2 and v3 check for the "nocto" flag but v4 doesn't. I'm not sure if this is an oversight or by design, but I can ask later today.

I'm using v3. This is a diskless Beowulf cluster, and mkinitcpio-net-utils doesn't boot from v4 yet.

Paul
On 12/10/2012 10:23 AM, Paul Gideon Dann wrote:
> On Monday 10 Dec 2012 09:26:51 Bryan Schumaker wrote:
>> On 12/10/2012 08:11 AM, Paul Gideon Dann wrote:
>>> Does anyone have any idea why "nocto" doesn't have the effect I was hoping it would? The "async" option works as expected, but it's more important to me that the client cache is correct, and it just bugs me.
>>
>> What NFS version are you using? I just took a quick glance through the code and, from what I can tell, v2 and v3 check for the "nocto" flag but v4 doesn't. I'm not sure if this is an oversight or by design, but I can ask later today.
>
> I'm using v3. This is a diskless Beowulf cluster, and mkinitcpio-net-utils doesn't boot from v4 yet.
>
> Paul

After thinking on this some more and going through the code again, I think that "nocto" only affects writing and fetching attributes for a file (the man page mentions using a heuristic to determine if files have changed on the server). I think the VFS is in charge of scheduling the flush, so we don't know if it's being called for close(), fsync(), or any other reason. The nocto flag is checked around calls to __nfs_revalidate_inode(), which is used to update information about a file.

- Bryan
On Tuesday 11 Dec 2012 09:07:12 Bryan Schumaker wrote:
> After thinking on this some more and going through the code again, I think that "nocto" only affects writing and fetching attributes for a file (the man page mentions using a heuristic to determine if files have changed on the server). I think the VFS is in charge of scheduling the flush, so we don't know if it's being called for close(), fsync(), or any other reason. The nocto flag is checked around calls to __nfs_revalidate_inode(), which is used to update information about a file.

I really appreciate you taking the time to look into this, Bryan. So you're saying that with "nocto", it's the VFS layer that's causing the flush on close now? That seems strange to me, since that's not normal behaviour for any filesystem other than NFS. If it's the attributes that are being checked by the heuristic, that also doesn't seem to explain why the data itself is being flushed...

Paul
On 12/11/2012 09:55 AM, Paul Gideon Dann wrote:
> On Tuesday 11 Dec 2012 09:07:12 Bryan Schumaker wrote:
>> After thinking on this some more and going through the code again, I think that "nocto" only affects writing and fetching attributes for a file (the man page mentions using a heuristic to determine if files have changed on the server). I think the VFS is in charge of scheduling the flush, so we don't know if it's being called for close(), fsync(), or any other reason. The nocto flag is checked around calls to __nfs_revalidate_inode(), which is used to update information about a file.
>
> I really appreciate you taking the time to look into this, Bryan. So you're saying that with "nocto", it's the VFS layer that's causing the flush on close now? That seems strange to me, since that's not normal behaviour for any filesystem other than NFS. If it's the attributes that are being checked by the heuristic, that also doesn't seem to explain why the data itself is being flushed...
>
> Paul

No problem! NFS is my day job :). I say "I think" because I've never traced through the close path before, but the parts that I saw didn't seem to flush anything (I'll admit that I still have a lot to learn, so I could be wrong). I'm about to look again, because your "that's not normal behaviour" comment makes me think I missed something...

Data being flushed would be the normal behavior of the client using close-to-open, since data is written back to the server on close. Data being flushed makes sense (at least to me) because it doesn't look like we're doing anything special for flush when nocto is set.

- Bryan
On 12/11/2012 10:19 AM, Bryan Schumaker wrote:
> On 12/11/2012 09:55 AM, Paul Gideon Dann wrote:
>> On Tuesday 11 Dec 2012 09:07:12 Bryan Schumaker wrote:
>>> After thinking on this some more and going through the code again, I think that "nocto" only affects writing and fetching attributes for a file (the man page mentions using a heuristic to determine if files have changed on the server). I think the VFS is in charge of scheduling the flush, so we don't know if it's being called for close(), fsync(), or any other reason. The nocto flag is checked around calls to __nfs_revalidate_inode(), which is used to update information about a file.
>>
>> I really appreciate you taking the time to look into this, Bryan. So you're saying that with "nocto", it's the VFS layer that's causing the flush on close now? That seems strange to me, since that's not normal behaviour for any filesystem other than NFS. If it's the attributes that are being checked by the heuristic, that also doesn't seem to explain why the data itself is being flushed...
>>
>> Paul
>
> No problem! NFS is my day job :). I say "I think" because I've never traced through the close path before, but the parts that I saw didn't seem to flush anything (I'll admit that I still have a lot to learn, so I could be wrong). I'm about to look again, because your "that's not normal behaviour" comment makes me think I missed something...
>
> Data being flushed would be the normal behavior of the client using close-to-open, since data is written back to the server on close. Data being flushed makes sense (at least to me) because it doesn't look like we're doing anything special for flush when nocto is set.
>
> - Bryan

After more poking about: filp_close() in the VFS calls nfs_file_flush(), which writes the file back to the server. The struct file we're given is freed up shortly after calling flush, so this looks like the last chance we get to write changes (and I don't think we should change that).

- Bryan
participants (2)

- Bryan Schumaker
- Paul Gideon Dann