
Native memory leak caused by BlockLiteral BlockDescriptor memory not being collected after a provided callback function is invoked #20503

Closed
enclave-alistair opened this issue Apr 24, 2024 · 6 comments · Fixed by #20556
Labels: bug, performance
Milestone: Future

Comments

enclave-alistair commented Apr 24, 2024

In my app (a Network Extension in this case) I call an Objective-C method (NEPacketTunnelFlow.ReadPackets), and this method gets called a lot. Over time, native memory consumption grows steadily, eventually running up against the 50 MB Network Extension limit.

It took me a long time to figure out what was actually leaking memory, but I've narrowed it down to BlockLiteral, which is used to pass a callback to the Objective-C method readPacketsWithCompletionHandler: on NEPacketTunnelFlow.

Inside BlockLiteral.SetupFunctionPointerBlock, AllocHGlobal is used to allocate native memory to contain the Objective-C BlockDescriptor. In my case it's about 96 bytes each time, but it varies based on the size of the trampoline method signature.

The problem is that the BlockDescriptor memory is not always freed after the resulting callback is invoked, leading to the native memory leaking.
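
To make the allocation side of this concrete, here is a tiny standalone fragment (illustrative only, not the actual runtime code) showing why a leak of this memory never shows up in the managed heap:

```csharp
using System;
using System.Runtime.InteropServices;

class NativeVsManaged {
	static void Main ()
	{
		// Memory from Marshal.AllocHGlobal is native (malloc-style) memory: it is
		// invisible to the GC, so leaking it grows the process footprint while
		// GC.GetTotalMemory stays flat.
		IntPtr descriptor = Marshal.AllocHGlobal (96); // ~96 bytes per block descriptor in this case

		// The runtime is supposed to release this once the block's reference count
		// drops to zero; if that never happens, the bytes leak.
		Marshal.FreeHGlobal (descriptor);
	}
}
```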

I verified this by copying the trampoline implementation into my own code and providing a reusable, pinned BlockLiteral instance to objc_msgSend instead of a fresh one each time; with that change the issue goes away.

From what I can tell, after the represented callback is invoked, the ref-count on the BlockLiteral is decremented, and the runtime should free that memory, but it isn't doing so consistently.

Steps to Reproduce

This may be specific to NEPacketTunnelFlow.ReadPackets, because inside the callback passed to the underlying readPacketsWithCompletionHandler: the ReadPackets method is invoked again, i.e. within the call stack of the same [UnmanagedCallersOnly] trampoline method, but I can't be sure. In general (a sketch of this read loop follows the list below):

  1. Invoke a Xamarin wrapper method that takes a managed callback.
  2. Call that method a lot; in a network extension this can happen hundreds of times per second.
  3. Periodically print the result of os_proc_available_memory() to the log. This will decrease gradually and not recover, but GC.GetTotalMemory will not increase.
  4. Look at the memory tree capture (I had to do this by attaching to the process in Xcode; I have never been able to successfully use Instruments on an extension process without it crashing).
  5. Note a lot of leaked memory references from AllocHGlobal, with the next call in the stack being the trampoline method signature.
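
A minimal sketch of the read loop described above (a hypothetical helper: the `ReadPackets (Action<NSData[], NSNumber[]>)` overload and the `os_proc_available_memory` P/Invoke declaration are assumptions for illustration, not code taken from the example project):

```csharp
using System;
using System.Runtime.InteropServices;
using Foundation;
using NetworkExtension;

static class PacketReadLoop {
	// Hypothetical P/Invoke: os_proc_available_memory is declared in <os/proc.h>;
	// the library path and signature used here are assumptions.
	[DllImport ("/usr/lib/libSystem.dylib")]
	static extern nint os_proc_available_memory ();

	static long reads;

	public static void Start (NEPacketTunnelFlow flow)
	{
		// Each call marshals the managed callback through a fresh BlockLiteral
		// (with a freshly AllocHGlobal'ed BlockDescriptor) to the native
		// readPacketsWithCompletionHandler: selector.
		flow.ReadPackets ((packets, protocols) => {
			// ... hand the packets to the tunnel ...

			if (++reads % 1000 == 0)
				Console.WriteLine ($"native available: {os_proc_available_memory ()}, managed heap: {GC.GetTotalMemory (false)}");

			// Re-issue the read from inside the completion handler (hundreds of
			// times per second under load); native memory drops steadily while
			// the managed heap stays flat.
			Start (flow);
		});
	}
}
```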

The issue should be visible in https://github.com/enclave-alistair/dotnet-ios-netextension, though it may need some package updates to match the latest Xamarin packages.

Expected Behavior

The memory allocated by BlockLiteral is reliably freed after the provided callback method exits.

Actual Behavior

The memory allocated by BlockLiteral is not consistently freed, leading to a gradual increase in memory usage.

Environment

Version information
.NET SDK:
 Version:           8.0.204
 Commit:            c338c7548c
 Workload version:  8.0.200-manifests.c4df6daf

Runtime Environment:
 OS Name:     Mac OS X
 OS Version:  13.6
 OS Platform: Darwin
 RID:         osx-x64
 Base Path:   /usr/local/share/dotnet/sdk/8.0.204/

.NET workloads installed:
 [maui]
   Installation Source: SDK 8.0.200
   Manifest Version:    8.0.7/8.0.100
   Manifest Path:       /usr/local/share/dotnet/sdk-manifests/8.0.100/microsoft.net.sdk.maui/8.0.7/WorkloadManifest.json
   Install Type:        FileBased

 [mobile-librarybuilder]
   Installation Source: SDK 8.0.200
   Manifest Version:    8.0.4/8.0.100
   Manifest Path:       /usr/local/share/dotnet/sdk-manifests/8.0.100/microsoft.net.workload.mono.toolchain.current/8.0.4/WorkloadManifest.json
   Install Type:        FileBased


Host:
  Version:      8.0.4
  Architecture: x64
  Commit:       2d7eea2529

Example Project

See https://github.com/enclave-alistair/dotnet-ios-netextension

rolfbjarne (Member) commented

This sounds like an interesting issue; I'll try to have a look next week.

I verified this by copying the trampoline implementation into my own code and providing a reusable, pinned BlockLiteral instance to objc_msgSend instead of a fresh one each time; with that change the issue goes away.

Can you provide the source code for the workaround somehow? Maybe in a different branch in your repository?

rolfbjarne added the bug label May 2, 2024
rolfbjarne (Member) commented

I can reproduce this.

rolfbjarne added this to the Future milestone May 2, 2024
rolfbjarne added the performance label May 2, 2024
rolfbjarne added a commit to rolfbjarne/xamarin-macios that referenced this issue May 3, 2024
…n#20503.

We're using two different functions to atomically decrement a reference count: the native `atomic_fetch_sub` and the managed `Interlocked.Decrement`.

Unfortunately their return values differ: `atomic_fetch_sub` returns the original value before the subtraction, while `Interlocked.Decrement` returns the decremented value, yet our code assumed the two behaved the same. This resulted in a memory leak, because we'd incorrectly expect `0` to be returned from `atomic_fetch_sub` when the reference count reaches zero, and thus not detect when a block's descriptor should be freed.

The fix is to update the expected return value from `atomic_fetch_sub` to be `1` instead of `0`.

Fixes xamarin#20503.
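
As a small, purely managed illustration of the mismatch described in the commit message above (the affected check itself lives in the runtime's native trampoline code; this snippet only contrasts the two return-value conventions):

```csharp
using System;
using System.Threading;

class RefCountSemantics {
	static void Main ()
	{
		int count = 1;

		// Interlocked.Decrement returns the value *after* the subtraction, so
		// "the count reached zero" corresponds to a return value of 0.
		if (Interlocked.Decrement (ref count) == 0)
			Console.WriteLine ("managed convention: safe to free the descriptor now");

		// C11's atomic_fetch_sub (&count, 1) instead returns the value *before*
		// the subtraction, so the same condition corresponds to a return value
		// of 1. Checking for 0 there, as the old native code did, means the free
		// never runs and the block descriptor leaks.
	}
}
```
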
rolfbjarne (Member) commented

I found the bug; it turned out to be a simple fix.

Thanks for the great test case btw!

enclave-alistair (Author) commented May 3, 2024

Oh, great stuff; just came back here to give my workaround example, which I'm guessing you don't need now. Would this have affected all native callbacks? Or just the one that I'm calling inside itself?

rolfbjarne (Member) commented

Blocks can be copied, and this bug only affects those that have been copied. Whether a block is copied or not is an implementation detail of the function you call, so you can't really know when that will happen.

enclave-alistair (Author) commented May 3, 2024

Ah, yes, from my own digging I've seen that the callback in this case is copied into a local field of the native NEPacketTunnelFlow when there are no pending packets to read.

Great, thanks for fixing.

rolfbjarne added a commit to rolfbjarne/xamarin-macios that referenced this issue May 6, 2024
…n#20503.

rolfbjarne added a commit that referenced this issue May 22, 2024
…al descriptors. Fixes #20503. (#20562)

Backport of #20556.

Co-authored-by: Alex Soto <[email protected]>