Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle str literals written with ' lexed as lifetime #122217

Merged
merged 6 commits into from
Mar 24, 2024

Conversation

estebank
Copy link
Contributor

@estebank estebank commented Mar 9, 2024

Given 'hello world' and `'1 str', provide a structured suggestion for a valid string literal:

error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-3.rs:2:26
   |
LL |     println!('hello world');
   |                          ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("hello world");
   |              ~           ~
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-1.rs:2:20
   |
LL |     println!('1 + 1');
   |                    ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("1 + 1");
   |              ~     ~

Fix #119685.

@rustbot
Copy link
Collaborator

rustbot commented Mar 9, 2024

r? @lcnr

rustbot has assigned @lcnr.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 9, 2024
@rust-log-analyzer

This comment has been minimized.

@lcnr
Copy link
Contributor

lcnr commented Mar 11, 2024

r? compiler

@rustbot rustbot assigned fmease and unassigned lcnr Mar 11, 2024
Copy link
Member

@fmease fmease left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, thanks! I have a couple of minor suggestions and a question. And of course, the negative overflow needs to be fixed.

If I follow the code correctly, for r'hello world' (a meant-to-be raw string literal), we will still emit two errors (prefix `r` is unknown & unterminated character literal). It's probably fine as-is but we could also think about suppressing the unknown prefix error if feasible 🤔

compiler/rustc_parse/src/lexer/mod.rs Outdated Show resolved Hide resolved
tests/ui/lexer/lex-bad-str-literal-as-char-2.stderr Outdated Show resolved Hide resolved
Copy link
Member

@fmease fmease Mar 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check if this duplicates any preexisting UI tests whose stderr contains character literal may only contain one codepoint / if you meant to write a `str` literal, use double quotes and if so remove one in favor of the other or consolidate them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The closest one to duplicating this case is the println!('●●') one, but I feel that having one for an unicode char and another for a whitespace is worthwhile keeping. Also, these are all separate files because lexer errors are early enough that the swallow later errors (so earlier lexer errors would silence later ones).

tests/ui/lexer/lex-bad-str-literal-as-char-3.rs Outdated Show resolved Hide resolved
tests/ui/lexer/lex-bad-str-literal-as-char-3.rs Outdated Show resolved Hide resolved
@fmease fmease added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 11, 2024
@estebank estebank added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 14, 2024
@rust-log-analyzer

This comment has been minimized.

Given `'hello world'` and `'1 str', provide a structured suggestion for a valid string literal:

```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-3.rs:2:26
   |
LL |     println!('hello world');
   |                          ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("hello world");
   |              ~           ~
```
```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-1.rs:2:20
   |
LL |     println!('1 + 1');
   |                    ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("1 + 1");
   |              ~     ~
```

Fix rust-lang#119685.
Copy link
Member

@fmease fmease left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two questions, otherwise r=me with or without nits addressed at your convenience. Sorry for the delay :| and thanks for taking this on! :)

fn main() {
println!('hello world'); //~ ERROR unterminated character literal
println!('hello world');
//[rust2015,rust2018,rust2021]~^ ERROR unterminated character literal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be simplified to

Suggested change
//[rust2015,rust2018,rust2021]~^ ERROR unterminated character literal
//~^ ERROR unterminated character literal

but not blocking.

@@ -59,6 +59,15 @@ impl<'a> Cursor<'a> {
iter.next().unwrap_or(EOF_CHAR)
}

/// Peeks the third symbol from the input stream without consuming it.
pub fn third(&self) -> char {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since second and third are exclusively used in the error path if I'm not mistaken (haven't double-checked), perf shouldn't matter that much and we can maybe avoid introducing those helpers? Idk, do we have access to self.chars in the parser? If so, we could just use .clone().nth(N), right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.chars is private to the lexer. I initially made it public, but given it is used only in one place it felt better to introduce the method. But then we could instead just have an nth method that expands to .clone().nth(N) instead.

Comment on lines +722 to +727
let err = errors::UnknownPrefix { span: prefix_span, prefix, sugg };
if silence {
self.dcx().create_err(err).delay_as_bug();
} else {
self.dcx().emit_err(err);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably structure it a bit differently, namely let err = create_err(); if silence { err.delay_as_bug() } else { err.emit() } but that's just a difference in taste. Definitely not blocking.

@@ -95,7 +95,7 @@ pub(crate) fn emit_unescape_error(
}
escaped.push(c);
}
if escaped.len() != lit.len() {
if escaped.len() != lit.len() || full_lit_span.is_empty() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that this prevents the negative overflow & that makes sense but why wasn't this an issue before? Don't have the time to dig deep into the code myself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we were replacing the full span with the new string (instead of doing BytePos math). We just end up with an invalid suggestion that won't be displayed.

@fmease
Copy link
Member

fmease commented Mar 23, 2024

Okay, seems fine as is.

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Mar 23, 2024

📌 Commit f4d30b1 has been approved by fmease

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 23, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Mar 23, 2024
Handle str literals written with `'` lexed as lifetime

Given `'hello world'` and `'1 str', provide a structured suggestion for a valid string literal:

```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-3.rs:2:26
   |
LL |     println!('hello world');
   |                          ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("hello world");
   |              ~           ~
```
```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-1.rs:2:20
   |
LL |     println!('1 + 1');
   |                    ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("1 + 1");
   |              ~     ~
```

Fix rust-lang#119685.
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 23, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#116016 (Soft-destabilize `RustcEncodable` & `RustcDecodable`, remove from prelude in next edition)
 - rust-lang#121281 (regression test for rust-lang#103626)
 - rust-lang#122168 (Fix validation on substituted callee bodies in MIR inliner)
 - rust-lang#122217 (Handle str literals written with `'` lexed as lifetime)
 - rust-lang#122379 (transmute: caution against int2ptr transmutation)
 - rust-lang#122907 (Uniquify `ReError` on input mode in canonicalizer)
 - rust-lang#122942 (Add test in higher ranked subtype)
 - rust-lang#122943 (add a couple more ice tests)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 23, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#116016 (Soft-destabilize `RustcEncodable` & `RustcDecodable`, remove from prelude in next edition)
 - rust-lang#121281 (regression test for rust-lang#103626)
 - rust-lang#122168 (Fix validation on substituted callee bodies in MIR inliner)
 - rust-lang#122217 (Handle str literals written with `'` lexed as lifetime)
 - rust-lang#122379 (transmute: caution against int2ptr transmutation)
 - rust-lang#122907 (Uniquify `ReError` on input mode in canonicalizer)
 - rust-lang#122942 (Add test in higher ranked subtype)
 - rust-lang#122943 (add a couple more ice tests)

r? `@ghost`
`@rustbot` modify labels: rollup
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Mar 24, 2024
Handle str literals written with `'` lexed as lifetime

Given `'hello world'` and `'1 str', provide a structured suggestion for a valid string literal:

```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-3.rs:2:26
   |
LL |     println!('hello world');
   |                          ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("hello world");
   |              ~           ~
```
```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-1.rs:2:20
   |
LL |     println!('1 + 1');
   |                    ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("1 + 1");
   |              ~     ~
```

Fix rust-lang#119685.
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 24, 2024
…kingjubilee

Rollup of 13 pull requests

Successful merges:

 - rust-lang#121281 (regression test for rust-lang#103626)
 - rust-lang#121940 (Mention Register Size in `#[warn(asm_sub_register)]`)
 - rust-lang#122217 (Handle str literals written with `'` lexed as lifetime)
 - rust-lang#122379 (transmute: caution against int2ptr transmutation)
 - rust-lang#122460 (Rework rmake support library API)
 - rust-lang#122797 (Fix compile of wasm64-unknown-unknown target)
 - rust-lang#122875 (CFI: Support self_cell-like recursion)
 - rust-lang#122879 (CFI: Strip auto traits off Virtual calls)
 - rust-lang#122895 (add some ice tests 5xxxx to 9xxxx)
 - rust-lang#122907 (Uniquify `ReError` on input mode in canonicalizer)
 - rust-lang#122923 (In `pretty_print_type()`, print `async fn` futures' paths instead of spans.)
 - rust-lang#122942 (Add test in higher ranked subtype)
 - rust-lang#122963 (core/panicking: fix outdated comment)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 24, 2024
…iaskrgr

Rollup of 9 pull requests

Successful merges:

 - rust-lang#121281 (regression test for rust-lang#103626)
 - rust-lang#122168 (Fix validation on substituted callee bodies in MIR inliner)
 - rust-lang#122217 (Handle str literals written with `'` lexed as lifetime)
 - rust-lang#122379 (transmute: caution against int2ptr transmutation)
 - rust-lang#122840 (`rustdoc --test`: Prevent reaching the maximum size of command-line by using files for arguments if there are too many)
 - rust-lang#122907 (Uniquify `ReError` on input mode in canonicalizer)
 - rust-lang#122942 (Add test in higher ranked subtype)
 - rust-lang#122943 (add a couple more ice tests)
 - rust-lang#122963 (core/panicking: fix outdated comment)

r? `@ghost`
`@rustbot` modify labels: rollup
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Mar 24, 2024
Handle str literals written with `'` lexed as lifetime

Given `'hello world'` and `'1 str', provide a structured suggestion for a valid string literal:

```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-3.rs:2:26
   |
LL |     println!('hello world');
   |                          ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("hello world");
   |              ~           ~
```
```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-1.rs:2:20
   |
LL |     println!('1 + 1');
   |                    ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("1 + 1");
   |              ~     ~
```

Fix rust-lang#119685.
@bors bors merged commit 1164c27 into rust-lang:master Mar 24, 2024
11 checks passed
@rustbot rustbot added this to the 1.79.0 milestone Mar 24, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 24, 2024
Rollup merge of rust-lang#122217 - estebank:issue-119685, r=fmease

Handle str literals written with `'` lexed as lifetime

Given `'hello world'` and `'1 str', provide a structured suggestion for a valid string literal:

```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-3.rs:2:26
   |
LL |     println!('hello world');
   |                          ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("hello world");
   |              ~           ~
```
```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-1.rs:2:20
   |
LL |     println!('1 + 1');
   |                    ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("1 + 1");
   |              ~     ~
```

Fix rust-lang#119685.
RenjiSann pushed a commit to RenjiSann/rust that referenced this pull request Mar 25, 2024
Handle str literals written with `'` lexed as lifetime

Given `'hello world'` and `'1 str', provide a structured suggestion for a valid string literal:

```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-3.rs:2:26
   |
LL |     println!('hello world');
   |                          ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("hello world");
   |              ~           ~
```
```
error[E0762]: unterminated character literal
  --> $DIR/lex-bad-str-literal-as-char-1.rs:2:20
   |
LL |     println!('1 + 1');
   |                    ^^^^
   |
help: if you meant to write a `str` literal, use double quotes
   |
LL |     println!("1 + 1");
   |              ~     ~
```

Fix rust-lang#119685.
RenjiSann pushed a commit to RenjiSann/rust that referenced this pull request Mar 25, 2024
…iaskrgr

Rollup of 9 pull requests

Successful merges:

 - rust-lang#121281 (regression test for rust-lang#103626)
 - rust-lang#122168 (Fix validation on substituted callee bodies in MIR inliner)
 - rust-lang#122217 (Handle str literals written with `'` lexed as lifetime)
 - rust-lang#122379 (transmute: caution against int2ptr transmutation)
 - rust-lang#122840 (`rustdoc --test`: Prevent reaching the maximum size of command-line by using files for arguments if there are too many)
 - rust-lang#122907 (Uniquify `ReError` on input mode in canonicalizer)
 - rust-lang#122942 (Add test in higher ranked subtype)
 - rust-lang#122943 (add a couple more ice tests)
 - rust-lang#122963 (core/panicking: fix outdated comment)

r? `@ghost`
`@rustbot` modify labels: rollup
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request May 28, 2024
Don't suggest turning non-char-literal exprs of ty `char` into string literals

Fixes rust-lang#125595.
Fixes rust-lang#125081.

r? estebank (rust-lang#122217) or compiler
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request May 28, 2024
Rollup merge of rust-lang#125640 - fmease:plz-no-stringify, r=estebank

Don't suggest turning non-char-literal exprs of ty `char` into string literals

Fixes rust-lang#125595.
Fixes rust-lang#125081.

r? estebank (rust-lang#122217) or compiler
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Diagnostics could be more helpful when confusing character and string literals
6 participants