Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opaque types should not be HIR items (and thus not be HIR owners) #129023

Closed
camelid opened this issue Aug 12, 2024 · 8 comments · May be fixed by #129244
Closed

Opaque types should not be HIR items (and thus not be HIR owners) #129023

camelid opened this issue Aug 12, 2024 · 8 comments · May be fixed by #129244
Assignees
Labels
A-hir Area: The high-level intermediate representation (HIR) C-cleanup Category: PRs that clean code up or issues documenting cleanup. E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@camelid
Copy link
Member

camelid commented Aug 12, 2024

Summary

  • Delete hir::ItemKind::OpaqueTy, add hir::Node::OpaqueTy
  • Replace the ItemId in hir::TyKind::OpaqueDef with &'hir OpaqueTy
  • Fix the fallout

A WIP version of much of this can be seen in 79a3e7a...dcb9e9a, but it's incomplete. The reason is that many parts of the compiler assume that opaques are items and handle them through implicit codepaths.

Background

Currently, opaque types (e.g., return-position impl Trait) are lowered into a hir::ItemKind::OpaqueTy and then a hir::TyKind::OpaqueDef that references that item's ID. However, because opaques' DefIds are created in rustc_ast_lowering, yet opaques can contain nodes whose DefIds are created in DefCollector (such as anon consts, which may in turn contain arbitrary nested items), this causes an incorrect parenting structure. This issue was uncovered in #128844, and it impeded that change, which delayed creation of defs for all expression-like nodes (anon consts, closures, etc.) until ast_lowering to fix other issues relating to def parents. @BoxyUwU summarized the issue in that PR's thread, reproduced here:

Context for why the Item::OpaqueTy changes since @cjgillot has started looking at this PR:

note: "this PR" indicates "this PR before the changes to opaques" as this was written from before those changes were made

Given the code example:

trait Trait<const N: usize> {}

#[rustfmt::skip]
fn foo() -> impl Trait<{ struct Bar; 1 }> {}

impl Trait<1> for () {}

The free_items for this module are:

free_items: [
  ItemId { owner_id: DefId(0:1 ~ playground_1[9aa8]::{use#0}) },
  ItemId { owner_id: DefId(0:2 ~ playground_1[9aa8]::std) },
  ItemId { owner_id: DefId(0:3 ~ playground_1[9aa8]::Trait) },
  ItemId { owner_id: DefId(0:7 ~ playground_1[9aa8]::foo) },
  ItemId { owner_id: DefId(0:11 ~ playground_1[9aa8]::foo::{opaque#0}) },
  ItemId { owner_id: DefId(0:9 ~ playground_1[9aa8]::foo::{constant#0}::Bar) },
  ItemId { owner_id: DefId(0:5 ~ playground_1[9aa8]::{impl#0}) }
]

so, HirIdValidator gets used on foo, opaque#0 and Bar, but not the anon const

* `foo` it will recurse and call `visit_nested_item` on the opaque which doesn't recurse, and just checks that the parent `DefId`'s hir's owner is `foo`

* `opaque#0`it will recurse and... check the `HirId` of the anon const is right (for some definition of right, unrelated to any of this), then recurse inside it, then visit `Bar` as a nested item and check that: `Bar`'s def parent's hir's owner is `opaque#0`,
  
  * On nightly that def parent is the anon const, and the anon consts hir is indeed owned by opaque#0
  * On this PR where items (that arent nested bodies) that are in anon consts are not parented to the anon const, `Bar`'s def parent is `foo`, so we attempt to check that `foo`'s hir's owner is `opaque#0` which it is not so we ICE

On nightly: The def parent setup for Bar is foo::{constant}::Bar, foo and opaque#0 are hir owners. When we validate opaque's HirId's, it succeeds as we ignore anon consts, and the fact that constant exists means that Bar's parent's hir is owned opaque. On this PR: The def parent setup for bar is foo::Bar, foo and opaque#0 are hir owners. When we validate opaques HirIds, it fails as constant#0 is no longer the parent of Bar so it correctly detects that we have a nested thing in opaque#0 with a parent that is "outside" of opaque in a different hir owner's hir

Conclusion: Opaque types either need to have their DefId created in DefCollector or needs to stop being a hir owner. Or, generally, if something is to be a hir owner node, the DefId for it should be created before all children DefIds or else you cannot create the parent structure correctly.

As mentioned above, the alternative is to create DefIds for opaques ahead-of-time, in DefCollector, but that is unsatisfactory because lowering of opaque types is quite complicated and that logic exists in rustc_ast_lowering. IMO it also makes more intuitive sense to create defs for things that don't themselves participate in resolution, such as anon consts and opaque types, during ast_lowering rather than def collection.

@camelid camelid added C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-hir Area: The high-level intermediate representation (HIR) E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. labels Aug 12, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Aug 12, 2024
@camelid camelid removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Aug 12, 2024
@compiler-errors compiler-errors added E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. and removed E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. labels Aug 12, 2024
@compiler-errors compiler-errors self-assigned this Aug 12, 2024
@compiler-errors
Copy link
Member

I talked to Boxy and I'd like to take a look at this, unless someone like @cjgillot knows how to fix it and has the time. I've relabeled this as E-hard since it likely requires a lot of patience going through the HIR visitors to ensure they're handling opaques correctly now.

camelid added a commit to camelid/rust that referenced this issue Aug 16, 2024
Ever since rust-lang#125915, some `ast::AnonConst`s turn into `hir::ConstArgKind::Path`s,
which don't have associated `DefId`s. To deal with the fact that we don't have
resolution information in `DefCollector`, we decided to implement a process
where if the anon const *appeared* to be trivial (i.e., `N` or `{ N }`), we
would avoid creating a def for it in `DefCollector`. If later, in AST lowering,
we realized it turned out to be a unit struct literal, or we were lowering it
to something that didn't use `hir::ConstArg`, we'd create its def there.

However, let's say we have a macro `m!()` that expands to a reference to a free
constant `FOO`. If we use `m!()` in the body of an anon const (e.g., `Foo<{ m!() }>`),
then in def collection, it appears to be a nontrivial anon const and we create
a def. But the macro expands to something that looks like a trivial const arg,
but is not, so in AST lowering we "fix" the mistake we assumed def collection
made and create a def for it. This causes a duplicate definition ICE.

The long-term fix for this is to delay the creation of defs for all expression-like
nodes until AST lowering (see rust-lang#128844 for an incomplete attempt at this). This
would avoid issues like this one that are caused by hacky workarounds. However,
doing this uncovers a pre-existing bug with opaque types that is quite involved
to fix (see rust-lang#129023).

In the meantime, this PR fixes the bug by delaying def creation for anon consts
whose bodies are macro invocations until after we expand the macro and know
what is inside it. This is accomplished by adding information to create the
anon const's def to the data in `Resolver.invocation_parents`.
@cjgillot
Copy link
Contributor

I posted a draft here: #129244

@petrochenkov
Copy link
Contributor

petrochenkov commented Aug 19, 2024

As mentioned above, the alternative is to create DefIds for opaques ahead-of-time, in DefCollector, but that is unsatisfactory because lowering of opaque types is quite complicated and that logic exists in rustc_ast_lowering

What are the complications exactly?
To created a DefId you need to 1) see impl Trait which is visible syntactically and 2) know that it is in return position and not argument position, that information is already tracked by def collector.

IMO it also makes more intuitive sense to create defs for things that don't themselves participate in resolution, such as anon consts and opaque types, during ast_lowering rather than def collection.

Ideally I'd like to see a design document (maybe a compiler MCP) describing what are the requirements on the DefId tree, what we want from the DefId parenting relation in particular.
I haven't read all the relevant PRs yet (#129137 (comment)), but so far everything starting from #125915 seems to be going into a wrong direction to me.

Ideally the whole DefId tree should be built in def collector, as I see it, because we can create "fully capable" def ids there, connected by parenting relations with macro calls in particular.
We can create additional "dummy" DefIds if for some node it is unclear at def collection time whether it really needs the DefId or not (this is relevant to min_generic_const_args).

If creation is delayed for some DefId, then it will suffer from inability to have children subtrees. Is it acceptable for some nodes? Maybe, then it should be documented why it is acceptable.
For some nodes, like closures, it doesn't look acceptable - if we skip closures from the parenting relations then why not continue and skip e.g. fn items as well.

@petrochenkov
Copy link
Contributor

Def id creation for impl Traits was actually performed in def collector not so long ago and was moved to AST lowering in #102483.

@petrochenkov
Copy link
Contributor

petrochenkov commented Aug 19, 2024

For impl Trait the questions of being or not being an item, and late or early DefId creation are sort of orthogonal (not entirely).
All four alternatives are possible:

The "created early" alternatives seem preferable to me, because the node is properly integrated into the def id tree this way.
The choice between "item" and "not item" needs to be made based on some other criteria, maybe "not item" is indeed better, I haven't carefully looked at #129244 yet.

@petrochenkov
Copy link
Contributor

Ideally the whole DefId tree should be built in def collector, as I see it, because we can create "fully capable" def ids there, connected by parenting relations with macro calls in particular.

The directly opposite alternative is to delay creation of DefIds for all nodes that do not own nontrivial HIR.
What the consequences of that will be:

  • Some unclear effect on incremental compilation - less deep def paths, but more disambiguators
  • Less information for the compiler logic to work on, which may make it hard to migrate compiler code (diagnostics in particular) to the new scheme
    • For example, enum variants are not HIR owners, but it's very much reasonable to expect that def ids of variants are parents of their fields' def ids
      • A lot of compiler logic likely relies on that
      • We probably cannot delay def id creation for variants because they have names and are integrated into the module structure, but if we could, then we'd need to change HIR node parenting as well, because if 2 def ids are not related by parenting, but 2 their corresponding hir ids are related by parenting, then it doesn't look good and may cause various issues. Are the same issues relevant to e.g. closures or impl traits? Most likely.

Some schemes in the middle between these two opposite alternatives are possible, I'm again not sure why we'd choose them (but that's again a question of what we want from the def id tree in general #129023 (comment)).
Creating everything in def collector (maybe creating more than strictly necessary) still looks like the simplest solution to me.

bors added a commit to rust-lang-ci/rust that referenced this issue Aug 22, 2024
Make opaque types regular HIR nodes

Having opaque types as HIR owner introduces all sorts of complications. This PR proposes to make them regular HIR nodes instead.

I haven't gone through all the test changes yet, so there may be a few surprises.

Many thanks to `@camelid` for the first draft.
Fixes rust-lang#129023

Fixes rust-lang#129099
Fixes rust-lang#125843
Fixes rust-lang#119716
Fixes rust-lang#121422
@cjgillot
Copy link
Contributor

On the general direction, I agree we should strive to have all non-leaf definitions created in the defcollector. For the simple reason that this ensures consistency of the def tree.

One reason we create impltrait definitions late is that the defcollector was buggy, and having the desugaring in a single place was simpler. But I'd like to try reverting it and fix def collector.

camelid added a commit to camelid/rust that referenced this issue Aug 25, 2024
Ever since rust-lang#125915, some `ast::AnonConst`s turn into `hir::ConstArgKind::Path`s,
which don't have associated `DefId`s. To deal with the fact that we don't have
resolution information in `DefCollector`, we decided to implement a process
where if the anon const *appeared* to be trivial (i.e., `N` or `{ N }`), we
would avoid creating a def for it in `DefCollector`. If later, in AST lowering,
we realized it turned out to be a unit struct literal, or we were lowering it
to something that didn't use `hir::ConstArg`, we'd create its def there.

However, let's say we have a macro `m!()` that expands to a reference to a free
constant `FOO`. If we use `m!()` in the body of an anon const (e.g., `Foo<{ m!() }>`),
then in def collection, it appears to be a nontrivial anon const and we create
a def. But the macro expands to something that looks like a trivial const arg,
but is not, so in AST lowering we "fix" the mistake we assumed def collection
made and create a def for it. This causes a duplicate definition ICE.

The long-term fix for this is to delay the creation of defs for all expression-like
nodes until AST lowering (see rust-lang#128844 for an incomplete attempt at this). This
would avoid issues like this one that are caused by hacky workarounds. However,
doing this uncovers a pre-existing bug with opaque types that is quite involved
to fix (see rust-lang#129023).

In the meantime, this PR fixes the bug by delaying def creation for anon consts
whose bodies are macro invocations until after we expand the macro and know
what is inside it. This is accomplished by adding information to create the
anon const's def to the data in `Resolver.invocation_parents`.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Sep 1, 2024
…chenkov

Create opaque definitions in resolver.

Implementing rust-lang#129023 (comment)

That was easier than I expected.

r? `@petrochenkov`
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Sep 1, 2024
Rollup merge of rust-lang#129493 - cjgillot:early-opaque-def, r=petrochenkov

Create opaque definitions in resolver.

Implementing rust-lang#129023 (comment)

That was easier than I expected.

r? `@petrochenkov`
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Sep 2, 2024
Create opaque definitions in resolver.

Implementing rust-lang/rust#129023 (comment)

That was easier than I expected.

r? `@petrochenkov`
camelid added a commit to camelid/rust that referenced this issue Sep 8, 2024
Ever since rust-lang#125915, some `ast::AnonConst`s turn into `hir::ConstArgKind::Path`s,
which don't have associated `DefId`s. To deal with the fact that we don't have
resolution information in `DefCollector`, we decided to implement a process
where if the anon const *appeared* to be trivial (i.e., `N` or `{ N }`), we
would avoid creating a def for it in `DefCollector`. If later, in AST lowering,
we realized it turned out to be a unit struct literal, or we were lowering it
to something that didn't use `hir::ConstArg`, we'd create its def there.

However, let's say we have a macro `m!()` that expands to a reference to a free
constant `FOO`. If we use `m!()` in the body of an anon const (e.g., `Foo<{ m!() }>`),
then in def collection, it appears to be a nontrivial anon const and we create
a def. But the macro expands to something that looks like a trivial const arg,
but is not, so in AST lowering we "fix" the mistake we assumed def collection
made and create a def for it. This causes a duplicate definition ICE.

The long-term fix for this is to delay the creation of defs for all expression-like
nodes until AST lowering (see rust-lang#128844 for an incomplete attempt at this). This
would avoid issues like this one that are caused by hacky workarounds. However,
doing this uncovers a pre-existing bug with opaque types that is quite involved
to fix (see rust-lang#129023).

In the meantime, this PR fixes the bug by delaying def creation for anon consts
whose bodies are macro invocations until after we expand the macro and know
what is inside it. This is accomplished by adding information to create the
anon const's def to the data in `Resolver.invocation_parents`.
@BoxyUwU
Copy link
Member

BoxyUwU commented Sep 12, 2024

With #129493 having landed I think we can close this ^^ The problem of "its not possible to parent defs inside of opaque types correctly" has been resolved

@BoxyUwU BoxyUwU closed this as completed Sep 12, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Sep 13, 2024
Fix anon const def-creation when macros are involved

Fixes rust-lang#128016.

Ever since rust-lang#125915, some `ast::AnonConst`s turn into `hir::ConstArgKind::Path`s,
which don't have associated `DefId`s. To deal with the fact that we don't have
resolution information in `DefCollector`, we decided to implement a process
where if the anon const *appeared* to be trivial (i.e., `N` or `{ N }`), we
would avoid creating a def for it in `DefCollector`. If later, in AST lowering,
we realized it turned out to be a unit struct literal, or we were lowering it
to something that didn't use `hir::ConstArg`, we'd create its def there.

However, let's say we have a macro `m!()` that expands to a reference to a free
constant `FOO`. If we use `m!()` in the body of an anon const (e.g., `Foo<{ m!() }>`),
then in def collection, it appears to be a nontrivial anon const and we create
a def. But the macro expands to something that looks like a trivial const arg,
but is not, so in AST lowering we "fix" the mistake we assumed def collection
made and create a def for it. This causes a duplicate definition ICE.

The long-term fix for this is to delay the creation of defs for all expression-like
nodes until AST lowering (see rust-lang#128844 for an incomplete attempt at this). This
would avoid issues like this one that are caused by hacky workarounds. However,
doing this uncovers a pre-existing bug with opaque types that is quite involved
to fix (see rust-lang#129023).

In the meantime, this PR fixes the bug by delaying def creation for anon consts
whose bodies are macro invocations until after we expand the macro and know
what is inside it. This is accomplished by adding information to create the
anon const's def to the data in `Resolver.invocation_parents`.

r? `@BoxyUwU`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-hir Area: The high-level intermediate representation (HIR) C-cleanup Category: PRs that clean code up or issues documenting cleanup. E-hard Call for participation: Hard difficulty. Experience needed to fix: A lot. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants