Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent member name collision when proxy implements same generic interface more than twice #285

Merged

Conversation

stakx
Copy link
Member

@stakx stakx commented Jul 2, 2017

After #89, this is a second attempt at fixing #88.

When a proxy is made to implement the same generic interface more than twice, and that generic interface contains a method that does not use any generic type parameters, this can result in a member name collision. This would happen e.g. with a proxy implementing IObserver<T> three times.

DynamicProxy already checks for member name collisions, and when it detects one, it switches to explicit interface implementation and prefixes member names with the interface's type name. For the above example, this will add the following methods to the proxy:

OnCompleted
IObserver`1.OnCompleted
IObserver`1.OnCompleted

Because the actual type arguments are omitted and only their count is included, the current collision checks are not sufficient. This commit makes explicitly implemented methods' names unique by including the full type names of the generic type arguments; for instance:

OnCompleted
System.IObserver`1[[System.Int32, mscorlib, Version=…, Culture=…, PublicKeyToken=…]].OnCompleted
System.IObserver`1[[System.String, mscorlib, Version=…, Culture=…, PublicKeyToken=…]].OnCompleted

(I've omitted some detail above for brevity, using .)

This is about the simplest possible solution to #88, but I'm not sure if you are OK with these kinds of member names. They are unique, but also long and ugly (which could be partly mitigated by using System.CodeDom to produce more C#-like type names), and they contain whitespace and other special characters; I have no idea whether these are always safe for proxy consumers.

Feel free to request modifications or to close this PR if it's not acceptable.

@stakx stakx force-pushed the generic-interfaces-name-collisions branch from 91a5ea2 to 77cfec0 Compare July 4, 2017 09:02
@@ -213,10 +213,37 @@ public void Should_implement_explicitly_duplicate_interface_members()
MethodInfo method = type.GetMethod("Foo", BindingFlags.Instance | BindingFlags.Public);
Assert.IsNotNull(method);
Assert.AreSame(method, type.GetTypeInfo().GetRuntimeInterfaceMap(typeof(IIdenticalOne)).TargetMethods[0]);
MethodInfo method2 = type.GetMethod("IIdenticalTwo.Foo", BindingFlags.Instance | BindingFlags.Public);
MethodInfo method2 = type.GetMethod($"{typeof(IIdenticalTwo).FullName}.Foo", BindingFlags.Instance | BindingFlags.Public);
Copy link
Member Author

@stakx stakx Jul 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why I chose not to hardcode the full type name here is because it includes things like DLL version numbers and public keys, which I wasn't sure whether they are the same on Mono and the CLR, and they're irrelevant to the test anyway.

@jonorossi
Copy link
Member

As you suspected, I'm not a fan of those member names. It appears they also won't be CLS compliant with the spaces: https://docs.microsoft.com/en-us/dotnet/standard/language-independence-and-language-independent-components#naming, I'm also not sure dots would be compliant but they don't seem to be mentioned. However, I'm not sure CLS compiliance really matters since no one is going to manually call them from language compiled code. I'd hate to see a stacktrace with them though, would be awfully confusing.

What do the CodeDom identifiers look like?

I'd be happy with using just the parameter short names (i.e. not namespace or assembly-qualified) which is what is already done for the first bit (e.g. IIdenticalTwo rather than Castle.DynamicProxy.Tests.Interfaces.IIdenticalTwo):

OnCompleted
IObserver`1.OnCompleted
IObserver`1.OnCompleted

OnCompleted
IObserver`1[Int32].OnCompleted  // <<<
IObserver`1[String].OnCompleted // <<<

The docs for Type.GetType state that the extra set of brackets is only needed around assembly-qualified types, so the format would look like this for multiple parameters:

IObserver`2[Int32,String].OnCompleted

Unless I missed it I can't see any property in the BCL on a type to get the unqualified name as well as its generic parameters, but should be easy to implement.

@jonorossi jonorossi added this to the vNext milestone Jul 4, 2017
@stakx
Copy link
Member Author

stakx commented Jul 4, 2017

What do the CodeDom identifiers look like?

I only mentioned this as a potential route to take, I haven't actually tried it out just yet. The general idea is that CodeDom can be used to create type names that look more like C# rather than IL (e.g., <> brackets for generic type arguments instead of the backtick and the [] brackets). My understanding of this technique is that it is customizable to some extent (e.g. whether you want to include namespaces or not). That all being said, it might actually be better to keep an additional subsystem like CodeDom out of the picture, if it's not absolutely required.

IObserver`2[Int32,String].OnCompleted

I like that suggestion. The resulting member names won't be guaranteed to be unique (e.g. if you use two types NamespaceA.Foo and NamespaceB.Foo as type arguments), but in practice, that probably won't matter at all in most cases.

Shall I go ahead and change the PR accordingly?

@jonorossi
Copy link
Member

That all being said, it might actually be better to keep an additional subsystem like CodeDom out of the picture, if it's not absolutely required.

Completely in agreement, not even sure if .NET Core has CodeDom.

The resulting member names won't be guaranteed to be unique

Agreed, it'll be exactly like the base type right now, you'd be silly to inherit/implement two Foo classes/interfaces 😉

Shall I go ahead and change the PR accordingly?

Yep, don't forget the changelog. Thanks.

When a proxy is made to implement the same generic interface more than
twice, and that generic interface contains a method that does not use
any generic type parameters, this can be result in a name collision.
This would happen e.g. with a proxy implementing `IObserver<T>` three
times.

DynamicProxy does check for member name collisions, and when such a
one is detected, it switches to explicit interface implementation and
prefixes the member name with the generic interface type name. For the
above example, this will add the following methods to the proxy:

 * OnCompleted
 * IObserver`1.OnCompleted
 * IObserver`1.OnCompleted

It is clearly not sufficient to exclude the concrete type arguments
and use only the number of type parameters. This commit changes the
naming of explicitly implemented interface methods such that they are
unique by including the generic type arguments. For example:

 * OnCompleted
 * IObserver`1[Boolean].OnCompleted
 * IObserver`1[Int32].OnCompleted
@stakx stakx force-pushed the generic-interfaces-name-collisions branch from 77cfec0 to 55192ae Compare July 4, 2017 20:19
@stakx
Copy link
Member Author

stakx commented Jul 4, 2017

I think something might still be missing here: This PR only takes care of method name collisions, but does nothing about events and properties. I'll try to see if a similar change needs to be made to the other types overriding MetaTypeElement.SwitchToExplicitImplementation (MetaEvent.cs, MetaProperty.cs).

{
stringBuilder.Append('[');
var genericTypeArguments = type.GetGenericArguments();
for (int i = 0, n = genericTypeArguments.Length; i < n; ++i)
Copy link

@ghost ghost Jul 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method might introduce a performance hit on generation. It is recursive.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just basing that on a rudimentary find of usages for MetaMethod.

image

Copy link
Member Author

@stakx stakx Jul 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. The recursion depth will be proportional to the nesting depth of the (generic) type for which it generates a name. Non-generic types, however, will not trigger any recursion.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fixes the original problem, but should we flag performance as a feature and make it part of the proxy options with a sensible default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the performance will be affected. This only gets run when SwitchToExplicitImplementation is called once per MetaMethod and it is then cached in its name field. The StringBuilder will perform pretty similar to the existing string.Format and users are rarely going to have more than a single level of generic nesting, and even then the cost here is negligible compared to the actual type generation or other work DP does.

@jonorossi
Copy link
Member

Looks good so far, just need to address properties and events.

@stakx stakx force-pushed the generic-interfaces-name-collisions branch from 5047f76 to b3ea5a3 Compare July 5, 2017 21:41
@stakx
Copy link
Member Author

stakx commented Jul 5, 2017

@jonorossi - Done.

I've extracted the logic for creating explicit implementation member names into a separate helper class MetaTypeElementUtil, since it's identical for all types of members (methods, events, and properties).

This opens up an opportunity to minimize unnecessary heap allocations. @fir3pho3nixx, like you showed in your review before, generating names for explicitly implemented members is possibly on a hot code path, so it should be reasonably efficient. It would be silly, for instance, having to instantiate 3 StringBuilder objects for just one property (1 for the property name, 1 for the getter method name, 1 for the setter method name). That's why I introduced a [ThreadStatic] StringBuilder that gets reused over and over again.

@ghost
Copy link

ghost commented Jul 5, 2017

@stakx - In this context I think it plays well with async. There is a caution about the use of ThreadStatic. We also switched this out for use of AsyncLocal for netcore.

@stakx
Copy link
Member Author

stakx commented Jul 5, 2017

@fir3pho3nixx - Thanks for checking up on my use of [ThreadStatic]. I was initially unsure whether it was really that easy to work with, but I've come to believe it should be perfectly fine in this particular case.

If my understanding of the TPL and async / await is correct, that Stack Overflow question revolves around the topic of task scheduling: is the continuation (i.e. the code after the await) guaranteed to execute on the same thread as the code before the await? (And the general answer to that is "no".)

I am almost certain that won't be an issue here, for the simple reason that our code in question is entirely synchronous. It should be impossible that code execution would switch from one thread to another all of a sudden and thus end up "stealing" the original thread's StringBuilder.

Unless I'm mistaken with any or all of the above, AsyncLocal<T> would be overkill here, because we have no Task or async / await induced async boundary across which we need to persist state.

@ghost
Copy link

ghost commented Jul 6, 2017

@stakx - I agree. It is all sync so it should be fine.

@jonorossi
Copy link
Member

In this context I think it plays well with async

Agreed. In general ThreadStatic will not work correctly in async/await, however there are no continuations in this code.

This opens up an opportunity to minimize unnecessary heap allocations.

What concerns me is that ThreadStatic and TLS are not free, the CLR has to call a helper function to access TLS to fetch that pointer for you, and this means the JIT can't aggressively optimise around accesses to this field. StringBuilders are fairly cheap and they'd only end up in Gen0 so would be efficiently garbage collected, and would mean StringBuilders aren't left around on the heap with pointers also left in TLS after generation is done (especially something long lived like a web app). I haven't done any performance tests, but I did come across one comment that said accessing a ThreadStatic field can be slower than an array allocation on .NET 3.5 (StringBuilder's constructor is pretty much just doing a char array allocation).

With that said the .NET Framework does actually have an internal StringBuilderCache class which is used by methods including String.Format, String.Join and String.Concat(IEnumerable<>). The class is a little more complex to optimise reuse of StringBuilders with enough capacity, because of fragmentation and internally StringBuilders are a linked list of StringBuilders to prevent any of it being allocated on the large object heap.

In this case it appears there is enough evidence that this will likely improve performance by some margin, but the performance improvement is currently unknown and likely negligible compared to the reflection calls to get the type names along with generic types; and this is all small compared to the reflection emit time.

It is best to avoid premature optimization (especially premature micro-optimisations) without measuring/benchmarking especially when the readability is hindered. For example, you could add a fast path to CreateNameForExplicitImplementation that avoids the StringBuilder if it isn't a generic type and call string.Concat(type.Name, ".", name) so it hits String's FastAllocateString method.

Let me know your thoughts, and if you disagree.

@stakx
Copy link
Member Author

stakx commented Jul 6, 2017

@jonorossi: Your response makes perfect sense to me.

I've heard similar things about the relatively large overhead of [ThreadStatic] on .NET 3.5. This appears to have been remedied in .NET 4, where the overhead of using [ThreadStatic] is said to be almost negligible. (It's also used in .NET 4's StringBuilderCache, after all.)

My overall impression is that using [ThreadStatic] would probably be slightly beneficial on .NET 4 and later, but hurt performance on .NET 3.5, so I'm adding a commit that removes this premature optimization. If ever you decide to drop support for .NET 3.5, you might want to revert that commit.

@stakx stakx force-pushed the generic-interfaces-name-collisions branch from 01d4085 to d5c941f Compare July 6, 2017 18:19
@ghost
Copy link

ghost commented Jul 6, 2017

Thanks for making this thread interesting. Did not know threadstatic performance issue existed(or was fixed), just generally avoided it like herpes altogether :) Learn't something heaps 👍

@jonorossi
Copy link
Member

@stakx thanks for making the change, and for getting this fix in. Merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants