Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safety rules for ref-like types like Span<T> #264

Closed
wants to merge 5 commits into from

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented Mar 13, 2017

Please do not merge yet.


There are three reasons why `Span<T>` must be a stack-only type.

1. `Span<T>` is basically a struct containing `{reference to data, offset limit}`. Concurrent "tearing" of such struct would lead to the possibility of out of range accesses and type-safety violations, which basically means "GC heap corruption".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would give the tuple example here. Essentially: Span<T> is logically a (ref T data, int length) structure.


In this document the "ref like" types include only `Span<T>` and related types. And the ref-like variables mean variables of "ref-like" types.

In order to force ref-like variables to be stack only,we need the following restrictions:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a space after comma


Span<T> Caller()
{
Span<T> safeToReturn Span<T> safeToReturn;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplication.


void Callee(ref Span<int> arg1)
{
// local refernces are escaping.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: refernces


## `ref-like` types must be stack-only. ##

C# compiler already has a concept of a restricted types that covers special platform types such as `TypedReference`. In some cases those types are referred as ref-like types as well. We should probably use some different term to refer to `TypedReference` and similar types - like `restricted types` to avoid confusion.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a restricted types

Should be a restricted type


```

Note: returning by a ordinary writeable reference appears to be ok by itself, but with values on heap being impossible, values in local frame being not returnable and variables passed from the caller being readonly, it may not matter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a ordinary

an ordinary

2. Some implementations of `Span<T>` literally contain a managed pointer in one of its fields. Managed pointers are not supported as fields of heap objects and code that manages to put a managed pointer on the GC heap typically crashes at JIT time.
3. It is permitted for a `Span<T>` to refer to data in the local stack frame - individual local variables or `stackalloc`-ed arrays. A scenario when an instance of a `Span<T>` outlives the referred data would lead to undefined behavior, including type-safety violations and heap corruptions.

All the above problems would be alleviated if instances of `Span<T>` would be allowed as stack data types only. In fact for the `#3` we need a stronger guarantee. Details on that to follow below.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Details on that to follow below.

Can you add a link/reference to the "details"?

@jkotas
Copy link
Member

jkotas commented Mar 17, 2017

BTW: Any thoughts on how the ref-like types are going to be identified?


## `ref-like` types must be stack-only. ##

C# compiler already has a concept of a restricted types that covers special platform types such as `TypedReference`. In some cases those types are referred as ref-like types as well. We should probably use some different term to refer to `TypedReference` and similar types - like `restricted types` to avoid confusion.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also byref-like type defined in ECMA spec. I think that one is fine - we are just making it to cover more types and relaxing the rules.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas - any suggestions?
Ideally it should be something that makes Span unconsumable by old code - like a modreq, but it is not possible to put modreq on a type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use an ordinary attribute, but it is easily ignorable.

Copy link
Member

@jkotas jkotas Mar 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, the solution we have discussed was to mark Span with ObsoleteAttribute with special message pattern, and make newer compilers to not emit warnings for ObsoleteAttribute with this special pattern. Check notes https://github.com/dotnet/csharplang/blob/master/meetings/2016/LDM-2016-11-01.md

- ref-like type cannot be a type of an array element
- ref-like type cannot be used as a generic type argument
- ref-like variable cannot be boxed
- ref-like type cannot be a field of ordinary not ref-like type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should "ref-like type cannot be a static field of any type" be added to this list?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ref-like type cannot be a type of a static field. It can only be an instance field of another ref-like type.

## ref-like variables can be passed as `in` parameters ##

- it is ok to pass a ref-like variable by reference to a call as long as the reference is **_readonly_** .
The reason is that exposing ref-like variables via writeable references make it nearly impossible to reason about their safety.
Copy link
Member

@MichalStrehovsky MichalStrehovsky Mar 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need any special rules for unmanaged pointers? When I type unsafe, I'll still be able to e.g. stackalloc an array of spans (useful), or shoot myself in the foot (not so useful), like I can with TypedReference, right?

using System;

unsafe class Program
{
    static void Leak(TypedReference* tr)
    {
        int i = 456;

        // Whoops
        *tr = __makeref(i);
    }

    static void Main()
    {
        TypedReference tr;
        Leak(&tr);
        Console.WriteLine(__refvalue(tr, int));
    }
}

(I look at this as the intended behavior of unsafe, but a blurb in the sense of "we thought about it and it's by design" might be useful.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsafe code is a tough subject. Some unsafe code has well-defined behavior even if not safe. Your example may actually run just fine depending on implementation.
C# as a language generally does not have a lot of guarantees about unsafe code.

I'd say the sample is legal, but its behavior is implementation specific and thus undefined.

Copy link
Member

@KrzysztofCwalina KrzysztofCwalina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does "receiver" in "ref-like fields are safe to return as long as the receiver is safe to return" mean?

The most permissive approach would be alias/value tracking via fixed point data-flow analysis. It is very expensive and complex approach to be practical, but will be called out here for completeness.

Simpler solutions are:
- make ref-like variables "single-assignment" similarly to the solution we have for ordinary ref variables.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can make them single assignment. It's super common to do span = span.Slice(count);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KrzysztofCwalina - would it be ok to allow this only for spans that are free from referring to local data?

With local data you would have to do 'var anotherSpan = span.Slice(count);`

The reason is that inside a method could be many nested scopes and it would be very hard to enforce that variables from outer scopes do not end up referring indirectly to variables from inner scopes if they can be reassigned.
On the other hand, everything outside of the method lives as long as you have a reference to it, so it is safe to assign to anything.


## `ref-like` types must be readonly structs. ##

- ref-like types must be readonly.
Copy link
Member

@ektrah ektrah Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to have a ref-like type that is not readonly if all ref-like fields are readonly?

Use case: I'm currently passing around mutable structs by reference for incrementally reading from a ReadOnlySpan (struct, usage) / incrementally writing to a Span (struct, usage).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ektrah - I need to think about it, but at first glance it seems unsafe. Any method on a non-readonly struct would get "this" by a read-write reference and as such can completely replace it with anything:

SomeMethod()
{
    // we can check if r1 is safe to return based on `arg1` and `arg2`
    Asn1Reader r1 =  GetReader(arg1, arg2);

    r1.Foo();

    // can we return or even use r1 after the call above?
    return r1;
}

struct Asn1Reader
{
      public void Foo()
      {
            // this will invalidate the assumptions that caller could make about `s1`
            this = new Asn1Reader(new ReadonlySpan<byte>(stackalloc byte[16]));
      }
}

Copy link
Member

@ektrah ektrah Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VSadov - stackalloc requires Foo to be unsafe, which probably means that Foo can do unsafe things with this and invalidate assumptions even if the type is readonly. This might be a better example:

struct MyStruct
{
    private ref int _x;

    public MyStruct(ref int x)
    {
        ref _x = ref x;
    }

    public void Bar()
    {
        int y = 123;
        this = new MyStruct(ref y); // error
    }

    public MyStruct Baz()
    {
        int y = 123;
        return new MyStruct(ref y); // error
    }

    public void Qux(out MyStruct result)
    {
        int y = 123;
        result = new MyStruct(ref y); // error
    }
}

To me it seems that assigning to a ref-like this should follow the same rules as returning a ref-like type, i.e., the assignment in Bar should be rejected for the same reason as the return statement in Baz.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ektrah - your examples are correct.

The idea with stacalloc is to have some syntax that would allow wrapping stack allocated data into a span (without leaking the ptr of course). Such construct could be used without "unsafe", since it would be type-safe, but we would need to make the resulting span not-returnable.

new ReadonlySpan<byte>(stackalloc byte[16]) is one possible example of such syntax. It is TBD though.

Copy link
Member

@ektrah ektrah Mar 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VSadov - A safe stackalloc would be nice. I'm already using unsafe stackalloc with Span<T> quite a few times: 1, 2, 3, 4, 5.

And it would really, really nice to be able to encapsulate a ref-like type together with mutable state, as I currently do in the structs linked above.

- ref-like type cannot be used as a generic type argument
- ref-like variable cannot be boxed
- ref-like type cannot be a field of ordinary not ref-like type
- indirect restrictions, such as disallowed use of ref-like types in async methods, which are really a result of disallowing ref-like typed fields.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about edge cases, where ref-like types could be allowed? For example:

  1. Values in async methods that are immediately passed to another method. E.g. even inside an async method, F(GetRefLike()) or F(await task, GetRefLike()) could be allowed (though F(GetRefLike(), await task) couldn't).
  2. Locals in async methods with no awaits.
  3. Locals that are part of a closure of a "simple" local function (i.e. one whose closure is a struct, not a class).

I'm not sure what would be the value in allowing such cases, especially since the rules could become byzantine and closely tied to implementation details of the compiler. But I think it's worth considering.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can look at preexisting behavior of types like TypedReference here. To make rules simple we may outright disallow ref-like types in async methods. Disallowing just the cases when ref-like is captured could lead to cases where something works in optimized code and does not in debug.

@VSadov
Copy link
Member Author

VSadov commented Apr 21, 2017

Based on feedback, the rules around passing span like types byref have been refined.
Because of that some of the feedback here has become inapplicable. The rest have been applied to the updated draft.

Since the discussion here is fairly long and concerns are captured in the updated document, I am closing this to start new discussion here:
#472

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants