FileStream rewrite part II #48813

Merged
merged 31 commits into from
Mar 17, 2021

Conversation

adamsitnik
Member

@adamsitnik adamsitnik commented Feb 26, 2021

Main changes:

  1. Two new Windows strategies: AsyncWindowsFileStreamStrategy and SyncWindowsFileStreamStrategy.
  2. The code that has been common for Legacy and new strategies has been moved to FileStreamHelpers.Windows (to reduce code duplication and minimize assembly size growth)
  3. The code that has been common for the new Async and Sync strategies (Position, Length etc) has been moved to WindowsFileStreamStrategy (a base abstract type). The Legacy strategies were not changed.
  4. The buffering logic has been removed from the strategies, and a new strategy called BufferedFileStreamStrategy has been introduced. It wraps the actual strategy (FileStreamStrategy : Stream) with BufferedStream.
  5. BufferedStream has been adapted to meet FileStream requirements, such as allowing async 0-byte blocking reads, and has received some perf improvements.

By replacing the custom buffering logic with BufferedStream, I was able to get #16341 and #27643 fixed out of the box.
The code has also become much simpler, as Sync and Async strategies are now just managed wrappers for native syscalls.
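
The layering described in points 3 and 4 can be pictured as a small decorator. What follows is only a minimal sketch under stated assumptions, not the code from this PR: `ReadStrategy`, `RawStrategy`, and `BufferedStrategy` are hypothetical names, only the synchronous read path is shown, and any `Stream` stands in for the native file handle.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the abstract strategy base type.
abstract class ReadStrategy
{
    public abstract int Read(Span<byte> destination);
}

// Hypothetical unbuffered strategy: every call goes straight to the source
// (in the PR this role is played by a thin wrapper over a native syscall).
sealed class RawStrategy : ReadStrategy
{
    private readonly Stream _source;

    // Counts how often the underlying source is hit.
    public int Calls { get; private set; }

    public RawStrategy(Stream source) => _source = source;

    public override int Read(Span<byte> destination)
    {
        Calls++;
        return _source.Read(destination);
    }
}

// Hypothetical buffering decorator: owns the buffer, delegates I/O to the inner strategy.
sealed class BufferedStrategy : ReadStrategy
{
    private readonly ReadStrategy _inner;
    private readonly byte[] _buffer;
    private int _pos, _len;

    public BufferedStrategy(ReadStrategy inner, int bufferSize)
    {
        _inner = inner;
        _buffer = new byte[bufferSize];
    }

    public override int Read(Span<byte> destination)
    {
        if (_pos == _len)
        {
            // Buffer exhausted: refill it with a single call to the inner strategy.
            _len = _inner.Read(_buffer);
            _pos = 0;
            if (_len == 0) return 0; // end of data
        }

        int n = Math.Min(destination.Length, _len - _pos);
        _buffer.AsSpan(_pos, n).CopyTo(destination);
        _pos += n;
        return n;
    }
}
```

With this shape, the raw strategy stays free of buffering concerns, and many small reads against the decorator turn into a few large reads against the source.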

The disadvantage is that BufferedStream acquires a lock to solve #16341 and #27643 and, as a side effect, it does not perform async reads and writes in parallel, which was kind of supported until now. This is a breaking change, and that is why I've not enabled the new strategy by default. I would prefer to implement #24847 to give users truly thread-safe async reads and writes (a set of methods that accept offsets), get #16354 and #25905 fixed, then test a few products against it (clone dotnet/sdk, set the env var, run all their tests), write a blog post, and enable it for Preview 4. That way we would give users an explanation, better perf, and a way to mitigate the breaking changes.
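
To illustrate why methods that accept offsets sidestep the locking problem, here is a hedged sketch. It assumes .NET 6 or later, where `File.OpenHandle` and `System.IO.RandomAccess` are available; this thread does not say whether that is the final shape of #24847. `OffsetReads` and its method names are made up for the example.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Win32.SafeHandles;

static class OffsetReads
{
    // Reads both halves of a file concurrently. Each call carries its own offset,
    // so there is no shared Position that a lock would have to protect.
    public static async Task<byte[]> ReadBothHalvesAsync(string path)
    {
        using SafeFileHandle handle = File.OpenHandle(
            path, FileMode.Open, FileAccess.Read, FileShare.Read, FileOptions.Asynchronous);

        long length = RandomAccess.GetLength(handle);
        byte[] result = new byte[length];
        int half = (int)(length / 2);

        Task first = FillAsync(handle, result.AsMemory(0, half), 0);
        Task second = FillAsync(handle, result.AsMemory(half), half);
        await Task.WhenAll(first, second);
        return result;
    }

    // Loops until the requested range is filled or the end of the file is reached.
    private static async Task FillAsync(SafeFileHandle handle, Memory<byte> buffer, long offset)
    {
        int done = 0;
        while (done < buffer.Length)
        {
            int read = await RandomAccess.ReadAsync(handle, buffer.Slice(done), offset + done);
            if (read == 0) break;
            done += read;
        }
    }
}
```

The sketch buffers the whole file in memory, so it only suits small files; the point is the per-call offset, not the reading strategy.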

fixes #27643
fixes #16341

Perf numbers:

Method Toolchain fileSize userBufferSize options Mean Ratio Allocated
Read \after\CoreRun.exe 1024 4096 None 52.31 us 0.99 208 B
Read \before\CoreRun.exe 1024 4096 None 52.59 us 1.00 208 B
Write \after\CoreRun.exe 1024 4096 None 189.62 us 1.00 208 B
Write \before\CoreRun.exe 1024 4096 None 190.18 us 1.00 208 B
ReadAsync \after\CoreRun.exe 1024 4096 None 65.94 us 0.98 744 B
ReadAsync \before\CoreRun.exe 1024 4096 None 67.45 us 1.00 656 B
WriteAsync \after\CoreRun.exe 1024 4096 None 191.62 us 1.00 208 B
WriteAsync \before\CoreRun.exe 1024 4096 None 192.10 us 1.00 208 B
ReadAsync \after\CoreRun.exe 1024 4096 Asynchronous 86.09 us 1.00 904 B
ReadAsync \before\CoreRun.exe 1024 4096 Asynchronous 86.06 us 1.00 800 B
WriteAsync \after\CoreRun.exe 1024 4096 Asynchronous 194.77 us 1.00 256 B
WriteAsync \before\CoreRun.exe 1024 4096 Asynchronous 194.37 us 1.00 240 B
OpenClose \after\CoreRun.exe 1024 ? None 45.64 us 0.99 208 B
OpenClose \before\CoreRun.exe 1024 ? None 45.92 us 1.00 208 B
LockUnlock \after\CoreRun.exe 1024 ? None 85.73 us 0.98 208 B
LockUnlock \before\CoreRun.exe 1024 ? None 87.14 us 1.00 208 B
SeekForward \after\CoreRun.exe 1024 ? None 264.87 us 1.02 208 B
SeekForward \before\CoreRun.exe 1024 ? None 260.04 us 1.00 208 B
SeekBackward \after\CoreRun.exe 1024 ? None 1,630.46 us 1.02 209 B
SeekBackward \before\CoreRun.exe 1024 ? None 1,601.48 us 1.00 209 B
ReadByte \after\CoreRun.exe 1024 ? None 55.60 us 0.93 4,328 B
ReadByte \before\CoreRun.exe 1024 ? None 59.82 us 1.00 4,328 B
WriteByte \after\CoreRun.exe 1024 ? None 1,009.55 us 1.01 4,329 B
WriteByte \before\CoreRun.exe 1024 ? None 1,004.64 us 1.00 4,329 B
Flush \after\CoreRun.exe 1024 ? None 3,845.19 us 1.01 4,330 B
Flush \before\CoreRun.exe 1024 ? None 3,810.89 us 1.00 4,330 B
FlushAsync \after\CoreRun.exe 1024 ? None 6,602.52 us 1.69 275,051 B
FlushAsync \before\CoreRun.exe 1024 ? None 3,913.10 us 1.00 4,330 B
CopyToFile \after\CoreRun.exe 1024 ? None 1,027.57 us 0.98 4,538 B
CopyToFile \before\CoreRun.exe 1024 ? None 1,044.95 us 1.00 8,658 B
CopyToFileAsync \after\CoreRun.exe 1024 ? None 1,087.54 us 1.04 5,562 B
CopyToFileAsync \before\CoreRun.exe 1024 ? None 1,051.26 us 1.00 9,546 B
OpenClose \after\CoreRun.exe 1024 ? Asynchronous 47.25 us 1.02 256 B
OpenClose \before\CoreRun.exe 1024 ? Asynchronous 46.33 us 1.00 240 B
LockUnlock \after\CoreRun.exe 1024 ? Asynchronous 90.60 us 1.00 256 B
LockUnlock \before\CoreRun.exe 1024 ? Asynchronous 90.74 us 1.00 240 B
SeekForward \after\CoreRun.exe 1024 ? Asynchronous 2,267.38 us 1.00 257 B
SeekForward \before\CoreRun.exe 1024 ? Asynchronous 2,274.21 us 1.00 241 B
SeekBackward \after\CoreRun.exe 1024 ? Asynchronous 4,881.20 us 1.00 258 B
SeekBackward \before\CoreRun.exe 1024 ? Asynchronous 4,897.97 us 1.00 242 B
ReadByte \after\CoreRun.exe 1024 ? Asynchronous 74.84 us 0.95 4,696 B
ReadByte \before\CoreRun.exe 1024 ? Asynchronous 79.16 us 1.00 4,680 B
WriteByte \after\CoreRun.exe 1024 ? Asynchronous 1,100.21 us 1.06 4,750 B
WriteByte \before\CoreRun.exe 1024 ? Asynchronous 1,038.08 us 1.00 4,735 B
Flush \after\CoreRun.exe 1024 ? Asynchronous 30,805.82 us 1.71 152,146 B
Flush \before\CoreRun.exe 1024 ? Asynchronous 17,981.91 us 1.00 152,213 B
FlushAsync \after\CoreRun.exe 1024 ? Asynchronous 34,242.24 us 1.89 307,973 B
FlushAsync \before\CoreRun.exe 1024 ? Asynchronous 18,101.38 us 1.00 152,067 B
CopyToFileAsync \after\CoreRun.exe 1024 ? Asynchronous 1,178.23 us 1.01 6,154 B
CopyToFileAsync \before\CoreRun.exe 1024 ? Asynchronous 1,163.50 us 1.00 9,955 B
Read \after\CoreRun.exe 1048576 512 None 722.67 us 1.01 4,328 B
Read \before\CoreRun.exe 1048576 512 None 717.40 us 1.00 4,328 B
Write \after\CoreRun.exe 1048576 512 None 5,902.44 us 1.00 4,331 B
Write \before\CoreRun.exe 1048576 512 None 5,898.69 us 1.00 4,331 B
ReadAsync \after\CoreRun.exe 1048576 512 None 1,687.85 us 0.50 86,673 B
ReadAsync \before\CoreRun.exe 1048576 512 None 3,408.43 us 1.00 234,051 B
WriteAsync \after\CoreRun.exe 1048576 512 None 7,385.12 us 0.77 78,188 B
WriteAsync \before\CoreRun.exe 1048576 512 None 9,646.79 us 1.00 234,038 B
ReadAsync \after\CoreRun.exe 1048576 512 Asynchronous 4,992.81 us 1.24 95,002 B
ReadAsync \before\CoreRun.exe 1048576 512 Asynchronous 4,020.74 us 1.00 41,488 B
WriteAsync \after\CoreRun.exe 1048576 512 Asynchronous 17,239.84 us 1.67 86,622 B
WriteAsync \before\CoreRun.exe 1048576 512 Asynchronous 10,351.44 us 1.00 41,646 B
Read \after\CoreRun.exe 1048576 4096 None 675.57 us 1.00 208 B
Read \before\CoreRun.exe 1048576 4096 None 677.16 us 1.00 208 B
Write \after\CoreRun.exe 1048576 4096 None 5,939.09 us 1.00 211 B
Write \before\CoreRun.exe 1048576 4096 None 5,916.73 us 1.00 211 B
ReadAsync \after\CoreRun.exe 1048576 4096 None 1,033.97 us 0.99 29,305 B
ReadAsync \before\CoreRun.exe 1048576 4096 None 1,042.15 us 1.00 29,217 B
WriteAsync \after\CoreRun.exe 1048576 4096 None 7,169.78 us 1.07 55,948 B
WriteAsync \before\CoreRun.exe 1048576 4096 None 6,700.28 us 1.00 29,212 B
ReadAsync \after\CoreRun.exe 1048576 4096 Asynchronous 4,845.40 us 0.99 80,466 B
ReadAsync \before\CoreRun.exe 1048576 4096 Asynchronous 4,880.56 us 1.00 80,362 B
WriteAsync \after\CoreRun.exe 1048576 4096 Asynchronous 17,092.40 us 1.01 85,786 B
WriteAsync \before\CoreRun.exe 1048576 4096 Asynchronous 16,924.07 us 1.00 80,362 B
ReadAsync_NoBuffering \after\CoreRun.exe 1048576 16384 None 412.63 us 0.89 7,648 B
ReadAsync_NoBuffering \before\CoreRun.exe 1048576 16384 None 464.17 us 1.00 7,713 B
WriteAsync_NoBuffering \after\CoreRun.exe 1048576 16384 None 26,331.27 us 1.00 7,656 B
WriteAsync_NoBuffering \before\CoreRun.exe 1048576 16384 None 26,324.59 us 1.00 7,720 B
ReadAsync_NoBuffering \after\CoreRun.exe 1048576 16384 Asynchronous 1,343.94 us 1.01 20,409 B
ReadAsync_NoBuffering \before\CoreRun.exe 1048576 16384 Asynchronous 1,328.15 us 1.00 20,457 B
WriteAsync_NoBuffering \after\CoreRun.exe 1048576 16384 Asynchronous 28,778.20 us 1.00 20,418 B
WriteAsync_NoBuffering \before\CoreRun.exe 1048576 16384 Asynchronous 28,864.73 us 1.00 20,466 B
CopyToFile \after\CoreRun.exe 1048576 ? None 4,802.88 us 1.00 423 B
CopyToFile \before\CoreRun.exe 1048576 ? None 4,809.57 us 1.00 423 B
CopyToFileAsync \after\CoreRun.exe 1048576 ? None 5,025.95 us 1.00 3,217 B
CopyToFileAsync \before\CoreRun.exe 1048576 ? None 5,036.89 us 1.00 2,887 B
CopyToFileAsync \after\CoreRun.exe 1048576 ? Asynchronous 5,537.49 us 1.01 4,217 B
CopyToFileAsync \before\CoreRun.exe 1048576 ? Asynchronous 5,476.47 us 1.00 8,001 B
Read \after\CoreRun.exe 104857600 4096 None 82,165.12 us 1.00 244 B
Read \before\CoreRun.exe 104857600 4096 None 82,227.80 us 1.00 244 B
Write \after\CoreRun.exe 104857600 4096 None 246,483.14 us 1.00 352 B
Write \before\CoreRun.exe 104857600 4096 None 250,876.59 us 1.00 352 B
ReadAsync \after\CoreRun.exe 104857600 4096 None 150,722.10 us 1.04 2,867,976 B
ReadAsync \before\CoreRun.exe 104857600 4096 None 145,035.38 us 1.00 2,867,888 B
WriteAsync \after\CoreRun.exe 104857600 4096 None 355,373.70 us 1.13 5,124,928 B
WriteAsync \before\CoreRun.exe 104857600 4096 None 316,042.83 us 1.00 2,867,920 B
ReadAsync \after\CoreRun.exe 104857600 4096 Asynchronous 479,619.12 us 0.98 7,987,936 B
ReadAsync \before\CoreRun.exe 104857600 4096 Asynchronous 489,108.02 us 1.00 7,987,832 B
WriteAsync \after\CoreRun.exe 104857600 4096 Asynchronous 1,914,460.08 us 1.01 8,094,624 B
WriteAsync \before\CoreRun.exe 104857600 4096 Asynchronous 1,900,086.97 us 1.00 7,987,824 B
ReadAsync_NoBuffering \after\CoreRun.exe 104857600 16384 None 50,222.6 us 1.03 701 KB
ReadAsync_NoBuffering \before\CoreRun.exe 104857600 16384 None 48,887.9 us 1.00 701 KB
WriteAsync_NoBuffering \after\CoreRun.exe 104857600 16384 None 107,467.87 us 1.01 717,344 B
WriteAsync_NoBuffering \before\CoreRun.exe 104857600 16384 None 107,064.61 us 1.00 717,408 B
ReadAsync_NoBuffering \after\CoreRun.exe 104857600 16384 Asynchronous 138,601.73 us 0.98 1,997,312 B
ReadAsync_NoBuffering \before\CoreRun.exe 104857600 16384 Asynchronous 141,456.15 us 1.00 1,997,424 B
WriteAsync_NoBuffering \after\CoreRun.exe 104857600 16384 Asynchronous 551,033.71 us 1.03 1,997,376 B
WriteAsync_NoBuffering \before\CoreRun.exe 104857600 16384 Asynchronous 537,239.34 us 1.00 1,997,424 B
CopyToFile \after\CoreRun.exe 104857600 ? None 76,699.79 us 1.00 524 B
CopyToFile \before\CoreRun.exe 104857600 ? None 76,463.15 us 1.00 524 B
CopyToFileAsync \after\CoreRun.exe 104857600 ? None 82,379.22 us 0.99 180,796 B
CopyToFileAsync \before\CoreRun.exe 104857600 ? None 83,225.34 us 1.00 180,460 B
CopyToFileAsync \after\CoreRun.exe 104857600 ? Asynchronous 156,041.68 us 1.01 252,032 B
CopyToFileAsync \before\CoreRun.exe 104857600 ? Asynchronous 155,232.32 us 1.00 255,312 B

Comment on perf:

  • Most of the benchmarks are within the range of error (Ratio column shows 0.98-1.02)
  • FlushAsync is now much more expensive: the benchmark, which is quite artificial as it writes a single byte in a loop and flushes, shows roughly a 1.7x to 1.9x regression. Until now, the buffer was flushed synchronously. This is the price we pay for async. I don't expect it to be a problem, as nobody should be calling FlushAsync in a loop like this benchmark does.
  • Using ReadAsync with a small user buffer size (stream buffer size / 8) on a file opened with FileOptions.Asynchronous has regressed by 24%, but again this is the cost of performing the read into the buffer in an async way (so far it was always synchronous).
  • I expect the two regressions above to be erased by the upcoming improvements (see #49145, "Keep tracking the file offset in memory, don't perform expensive Seek calls on every Read|WriteAsync", for some nice perf wins).
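
For readers who want to picture the FlushAsync case, the two shapes below contrast the pattern the benchmark is described as using with the more typical one. This is a sketch based only on the description above, not the benchmark source; `FlushPatterns` and its method names are hypothetical.

```csharp
using System.IO;
using System.Threading.Tasks;

static class FlushPatterns
{
    // The shape the benchmark is described as having: one byte, then a flush, repeated.
    public static async Task WriteFlushingEveryByteAsync(Stream stream, int count)
    {
        for (int i = 0; i < count; i++)
        {
            stream.WriteByte(42);
            await stream.FlushAsync(); // one flush per byte written
        }
    }

    // The more typical shape: let the buffer fill up and flush once at the end.
    public static async Task WriteFlushingOnceAsync(Stream stream, int count)
    {
        for (int i = 0; i < count; i++)
        {
            stream.WriteByte(42);
        }

        await stream.FlushAsync();
    }
}
```

Both methods produce the same bytes; only the number of flushes differs, which is where the extra cost of a truly asynchronous flush shows up.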

Base automatically changed from master to main March 1, 2021 09:08
@adamsitnik adamsitnik changed the title from "[DRAFT] BufferedFileStreamStrategy" to "FileStream rewrite part II" Mar 3, 2021
@adamsitnik adamsitnik marked this pull request as ready for review March 3, 2021 15:45
@adamsitnik adamsitnik added this to the 6.0.0 milestone Mar 3, 2021
@adamsitnik
Member Author

@carlossanlop @jozkee @stephentoub PTAL ;)

@danmoseley
Member

write a blog post and enable it for Preview 4

I like this part - it’d be great for our preview users to help us validate as we go along.

@adamsitnik
Member Author

adamsitnik commented Mar 10, 2021

@carlossanlop @jozkee @stephentoub first of all, big thanks for reviewing the PR and sharing a lot of very valuable feedback with me!

I think that I've answered all the questions and addressed most of the concerns (3f6b541)

However, two important things remain open:

  • Should I stop trying to accommodate BufferedStream to meet FileStream requirements and move the buffering logic to BufferedFileStreamStrategy? It would increase the code duplication but would keep BufferedStream unchanged. What do you prefer? It could be more future proof (easier to introduce changes specific to file stream) and more performant (avoid an allocation of extra reference type, one less layer of abstraction).
  • Should we introduce another breaking change and stop trying to flush the buffer in the finalizer? See #48813 (comment) for the full discussion. Personally, I am not a big fan of this; we could at least keep the old behaviour for files opened for sync IO.

@jozkee
Member

jozkee commented Mar 10, 2021

Should I stop trying to accommodate BufferedStream to meet FileStream requirements and move the buffering logic to BufferedFileStreamStrategy?

I think it is worth switching if there's good value in doing so. Do you think we could use the benchmark results to make the decision?

@stephentoub
Member

stephentoub commented Mar 10, 2021

Should I stop trying to accommodate BufferedStream to meet FileStream requirements and move the buffering logic to BufferedFileStreamStrategy?

I suggest we start by putting the buffering logic in BufferedFileStreamStrategy. This still enables us to avoid all the buffering duplication we had across aspects of the FileStream implementation and not clutter up BufferedStream. We can then subsequently see if we can consolidate BufferedFileStreamStrategy with BufferedStream, and what new APIs on BufferedStream we might need to do so without contorting an internal contract.

Should we introduce another breaking change and stop trying to flush the buffer in the finalizer? See #48813 (comment) for the full discussion. Personally, I am not a big fan of this; we could at least keep the old behaviour for files opened for sync IO.

I'm ok if you want to keep the old behavior, at least for now. I think the behavior is broken, though. And we shouldn't try to strengthen the old behavior in any way.

@adamsitnik
Member Author

@stephentoub @jozkee @carlossanlop I've taken the best of BufferedStream (100% async IO, but using a semaphore) and LegacyFileStream (special handling for pipes) and extended BufferedFileStreamStrategy with the buffering logic. All tests are passing. PTAL
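
The "100% async IO, but using a semaphore" idea can be sketched as follows. This is a hypothetical illustration under stated assumptions, not the BufferedFileStreamStrategy from this PR: `SemaphoreBufferedReader` is a made-up name, only the read path is shown, and the special handling for pipes is left out.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical illustration of semaphore-guarded, fully asynchronous buffering.
sealed class SemaphoreBufferedReader
{
    private readonly Stream _inner;
    private readonly byte[] _buffer;
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private int _pos, _len;

    public SemaphoreBufferedReader(Stream inner, int bufferSize)
    {
        _inner = inner;
        _buffer = new byte[bufferSize];
    }

    public async ValueTask<int> ReadAsync(Memory<byte> destination, CancellationToken ct = default)
    {
        // Unlike a lock statement, a SemaphoreSlim can be held across an await.
        await _gate.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            if (_pos == _len)
            {
                // The refill itself is asynchronous; no thread blocks while it is pending.
                _len = await _inner.ReadAsync(_buffer.AsMemory(), ct).ConfigureAwait(false);
                _pos = 0;
                if (_len == 0) return 0; // end of data
            }

            int n = Math.Min(destination.Length, _len - _pos);
            _buffer.AsMemory(_pos, n).CopyTo(destination);
            _pos += n;
            return n;
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Because every operation waits on the same semaphore, concurrent callers are serialized rather than run in parallel, which is the trade-off discussed in the PR description.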

Member

@jozkee jozkee left a comment

This looks good to me.

I couldn't find anything that would block merging, only feedback I have is a bunch of questions and possible concerns that could be addressed in follow-up PRs.

Thanks @adamsitnik!

}
else
{
throw Win32Marshal.GetExceptionForWin32Error(errorCode, _path);
Member

nit: use a throw helper not to achieve inlining but to be consistent with other exceptions.

}
else
{
throw Win32Marshal.GetExceptionForWin32Error(errorCode, _path);
Member

nit: use throw helper.

if (errorCode == ERROR_INVALID_PARAMETER)
throw new ArgumentException(SR.Arg_HandleNotSync, "_fileHandle");

throw Win32Marshal.GetExceptionForWin32Error(errorCode, _path);
Member

Same here.

// where the position is too large or for synchronous writes
// to a handle opened asynchronously.
if (errorCode == ERROR_INVALID_PARAMETER)
throw new IOException(SR.IO_FileTooLongOrHandleNotSync);
Member

And here.
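
For context on the four "throw helper" nits above, the general pattern looks roughly like this. It is a generic sketch, not the helper used in dotnet/runtime: `IoThrowHelper`, `Caller`, and the message format are hypothetical.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;
using System.IO;

static class IoThrowHelper
{
    // Hypothetical helper: the throw lives in one place, so every call site
    // raises the same exception type and message shape for the same condition.
    [DoesNotReturn]
    public static void ThrowIOException(int errorCode, string path) =>
        throw new IOException($"I/O error {errorCode} on '{path}'.", errorCode);
}

static class Caller
{
    // A call site delegates to the helper instead of constructing the exception inline.
    public static void CheckResult(int errorCode, string path)
    {
        if (errorCode != 0)
        {
            IoThrowHelper.ThrowIOException(errorCode, path);
        }
    }
}
```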

@adamsitnik
Member Author

adamsitnik commented Mar 17, 2021

@jozkee big thanks for the review! I am going to address the refactoring suggestions in a separate PR as soon as I merge this one. (edit: link to the PR: #49750)

@carlossanlop @stephentoub Since the new strategy is disabled by default, I am going to merge this PR to unblock #49145 and #49638 and also make the memory investigation easier. If you have any feedback, I am going to address it in a separate PR.

@adamsitnik adamsitnik merged commit 6ef4b2e into dotnet:main Mar 17, 2021
@adamsitnik adamsitnik deleted the newWindowsFileStreamStrategy branch March 17, 2021 18:52
@adamsitnik adamsitnik added the breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. label Apr 7, 2021
@ghost ghost added the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Apr 7, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 7, 2021
@adamsitnik
Member Author

Breaking change doc: dotnet/docs#24060

@adamsitnik adamsitnik removed the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Oct 15, 2021
Labels
area-System.IO breaking-change Issue or PR that represents a breaking API or functional change over a prerelease.
Development

Successfully merging this pull request may close these issues.

  • FileStream.FlushAsync ends up doing synchronous writes
  • Win32 FileStream turns async reads into sync reads
10 participants