Channel: Comments for Clustering and High-Availability

re: Troubleshooting Cluster Shared Volume Auto-Pauses – Event 5120


<part 2>

Starting with Windows 8, Filter Manager offers a mitigation for the case where the section is created by a file system minifilter (for instance, an antivirus filter). See the documentation for FltRegisterForDataScan/FltCreateSectionForDataScan/FltCloseSectionForDataScan. The idea is that as long as minifilters create their sections using these routines and implement a callback that Filter Manager can use to tell the filter to close the section quickly, Filter Manager will take care of retrying non-cached IO that fails.

If this mitigation does not work for you, you can keep retrying from your filter. In that case I would suggest the following:

- Do not retry if you see that this is paging IO. MM/CC will retry it for you.

- In general, limit the number of retries, because in the worst case you might end up in an endless ping-pong with someone trying to page in a page while your non-cached IO is trying to throw it away.

- On CSVFS, limit the number of retries to a few when you see in pre-op that you are called recursively in the context of another top-level IO. To detect that, you can use IoGetTopLevelIrp in pre-op. During CSV state transitions (blogs.msdn.com/.../10567706.aspx), CSVFS waits for all IO to complete, and your retry might delay that. If the delay is long enough (several minutes), CSVFS might give up and invalidate the volume. If possible, prefer to retry the top-level IO. And again, keep in mind that MM/CC will retry the IOs that they originate, so you do not need to worry about those.
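
The three suggestions above boil down to a small retry policy. What follows is only a user-mode sketch of that policy in plain C, not minifilter code: IoFacts, retry_budget, issue_with_retries, fail_n_times and both limit values are hypothetical names and numbers picked for illustration. In a real filter the three flags would be filled in from the pre-op callback (for example, "nested" from IoGetTopLevelIrp returning non-NULL).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a pre-operation callback has learned about
   one non-cached IO. None of these names come from the WDK. */
typedef struct {
    bool paging_io;   /* request is paging IO (originated by MM/CC) */
    bool on_csvfs;    /* target volume is a CSVFS volume */
    bool nested;      /* in a real filter: IoGetTopLevelIrp() != NULL in pre-op */
} IoFacts;

/* Assumed limits; the advice above only says "limit" and "a few". */
enum { RETRY_LIMIT_DEFAULT = 10, RETRY_LIMIT_NESTED_CSVFS = 3 };

/* How many times the filter itself may retry this request. */
unsigned retry_budget(const IoFacts *io)
{
    assert(io != NULL);
    if (io->paging_io)
        return 0;                         /* MM/CC retries its own IO */
    if (io->on_csvfs && io->nested)
        return RETRY_LIMIT_NESTED_CSVFS;  /* do not stall a CSV state transition */
    return RETRY_LIMIT_DEFAULT;           /* bounded, never endless */
}

/* Stand-in for issuing the IO once; returns true on success. */
typedef bool (*AttemptFn)(void *ctx);

/* Issue once, then retry up to the budget. Reports total attempts made. */
bool issue_with_retries(const IoFacts *io, AttemptFn attempt, void *ctx,
                        unsigned *attempts)
{
    unsigned budget = retry_budget(io);
    unsigned made = 0;
    bool ok = false;

    do {
        ok = attempt(ctx);
        made++;
    } while (!ok && made <= budget);

    if (attempts != NULL)
        *attempts = made;
    return ok;
}

/* Test double: fails until its counter reaches zero, then succeeds. */
bool fail_n_times(void *ctx)
{
    unsigned *remaining = ctx;
    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}
```

With these assumed limits, a nested IO on CSVFS gets one initial attempt plus at most three retries, while paging IO is issued once and never retried by the filter. The sketch does not cover the preference for retrying the top-level IO instead of the nested one, since that depends on how the filter is structured.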

<end of reply>

Thanks

Vladimir Petter

Microsoft Corporation.

