Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miri hangs when a test failed #2273

Closed
RalfJung opened this issue Jun 26, 2022 · 14 comments
Closed

Miri hangs when a test failed #2273

RalfJung opened this issue Jun 26, 2022 · 14 comments
Labels
A-aliasing Area: This affects the aliasing model (Stacked/Tree Borrows) C-bug Category: This is a bug. I-slow Impact: Makes Miri even slower than it already is

Comments

@RalfJung
Copy link
Member

RalfJung commented Jun 26, 2022

When a test failed (not with UB, but with a regular assertion failure), and libtest goes to print the failed tests, something goes wrong and Miri just hangs. I am not sure if this always happens, but I just got it in crossbeam and I seem to recall @saethlin also mentioning this.

To reproduce:

Using -Zmiri-progress-report, I can tell that interpretation itself grinds to a halt -- usually it should print progress every few seconds, but it doesn't, since operations seem to start taking a lot longer. By reducing the progress interval I got this backtrace:

   --> /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/slice/mod.rs:405:20
    |
405 |         unsafe { &*index.get_unchecked(self) }
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^ progress report: current operation being executed is here
    |
    = note: inside `core::slice::<impl [u8]>::get_unchecked::<usize>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/slice/mod.rs:405:20
    = note: inside `<core::str::lossy::Utf8LossyChunksIter as std::iter::Iterator>::next` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/str/lossy.rs:65:34
    = note: inside `std::string::String::from_utf8_lossy` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/alloc/src/string.rs:633:48
    = note: inside `test::formatters::pretty::PrettyFormatter::<std::io::Stdout>::write_results` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/test/src/formatters/pretty.rs:141:30
    = note: inside `test::formatters::pretty::PrettyFormatter::<std::io::Stdout>::write_failures` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/test/src/formatters/pretty.rs:164:9
    = note: inside `<test::formatters::pretty::PrettyFormatter<std::io::Stdout> as test::formatters::OutputFormatter>::write_run_finish` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/test/src/formatters/pretty.rs:248:17
    = note: inside `test::run_tests_console` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/test/src/console.rs:298:5
    = note: inside `test::test_main` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/test/src/lib.rs:113:15
    = note: inside `test::test_main_static` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/test/src/lib.rs:132:5
    = note: inside `main`

Possibly #1935 will help.

@RalfJung RalfJung added C-bug Category: This is a bug. A-aliasing Area: This affects the aliasing model (Stacked/Tree Borrows) I-slow Impact: Makes Miri even slower than it already is labels Jun 26, 2022
@saethlin
Copy link
Member

saethlin commented Jun 26, 2022

I'm trying to use this as a test case and now the test in question is passing.

Here's a useful program:

fn main() {
    leaf(100);
}

fn leaf(x: usize) {
    if x == 0 {
        panic!("ouch");
    }
    leaf(x - 1);
}

#[test]
fn oof() {
    panic!("ouch");
}

So based on this, the stack depth is irrelevant (cargo miri run prints slowly, cargo miri test hangs), which is comforting. The problem is definitely related to libtests and that code path through PrettyFormatter specifically.

I actually added the backtraces bench-cargo-miri project to target this general situation, and at the time I didn't realize there are two different problems here. More motivation to go deal with that rebase I guess :p

@RalfJung
Copy link
Member Author

there are two different problems here

There are?

@saethlin
Copy link
Member

Ah sorry, speaking from my particular perspective on SB performance 😅

From what I recall, just the Debug impl that gets invoked to print a backtrace is pathological with respect to the caching I'm trying to add in #1935.

However, that PR does finish running these tests in ~46 seconds which makes it a >10x speedup over the current situation. Which I think is due to a big speedup along the code path you pointed out. Though we'll know for sure when I get around to rebasing the PR.

@Kixiron
Copy link
Member

Kixiron commented Jun 27, 2022

I made a PR that changes Utf8LossyChunksIter::next() to use pointers instead of references and it seems to completely alleviate the test case from #2273 (comment), dropping the runtime down to ~5 seconds on my machine

@saethlin
Copy link
Member

Here is another fun program:

fn main() {
    let x: &[u8] = &[0u8; 256];
    let start = std::time::Instant::now();
    for i in 0..x.len() {
        //let _item = unsafe { *x.get_unchecked(i) };
        let _item = x[i];
    }
    println!("{:?}", start.elapsed());

    let x: &[u8] = &[0u8; 512];
    let start = std::time::Instant::now();
    for i in 0..x.len() {
        //let _item = unsafe { *x.get_unchecked(i) };
        let _item = x[i];
    }
    println!("{:?}", start.elapsed());

    let x: &[u8] = &[0u8; 1024];
    let start = std::time::Instant::now();
    for i in 0..x.len() {
        //let _item = unsafe { *x.get_unchecked(i) };
        let _item = x[i];
    }
    println!("{:?}", start.elapsed());
}

On my machine, this prints:

22.68581ms
43.578931ms
85.368123ms

Very reasonable, ~2x increase in runtime per doubling of the array size.
But if I swap in all the get_unchecked for the square bracket indexing, I end up with:

58.717391ms
319.230435ms
2.783580145s

That's more like a ~8x increase per doubling of the array size. Ouch. If I stick in a x.len() call to model the behavior of the iterator that powers String::from_utf8_lossy, I see a further ~4x regression. So some quick estimation based on the amount of output we get from the a RUST_BACKTRACE=full test failure suggests the backtrace should finish printing in about 3 hours.

Also, String::from_utf8_lossy has the same runtime and runtime scaling as just looping over a slice and doing len() + get_unchecked(i).

So if I had to go for a minimal diff to address the backtrace printing issue, hoist the length and a pointer to the slice into a local outside of the loop, and use pointer arithmetic to read each element in the ASCII path instead of calling get_unchecked.

I would really like to do something about this on the Miri end, but I'm not sure what to do. The debugging support I slapped together to figure out what tags are being added might be a start?

@RalfJung
Copy link
Member Author

What happens with something like this?

let elem = &x[i];
let _item = *elem;

Then we do still get a reborrow for each iteration. But we avoid the large reborrow of x, which I think is the problem.

@saethlin
Copy link
Member

Same runtimes within noise as let _item = x[i]. Yes, I agree that these large reborrows are the problem.

@RalfJung
Copy link
Member Author

Note that with &x[i], we do already end up with a unique borrow stack per location. So that is not, in itself, the problem.

I think there is, in theory, an optimization we can do here, but I am not sure how practical it is.
When we reborrow a tag that is SharedReadOnly everywhere from another one, and both tags are in the borrow stack of the exact same locations, then we can just keep using the original tag and don't need to generate+push a new one. Basically, two SharedReadOnly tags that are in the same stacks don't differentiate anything, they will always be invalidated at the exact same moment.

bors added a commit that referenced this issue Jul 13, 2022
Add a benchmark of the hang-on-test-failure code path

This is the code pattern that produces the performance problem in #2273

I figured out what I was stuck on in #2315 (comment). For a while I was just doing `let x: &[u8] = &[0u8; 4096];` but that doesn't produce the runtime inside `Stack::item_popped` that I was looking for, I think because this allocation is never deallocated. But with `Vec`, I get the profile I'm looking for.
bors added a commit to rust-lang-ci/rust that referenced this issue Sep 3, 2022
…mulacrum

Update backtrace

Most notably, this pulls in rust-lang/backtrace-rs@ebc9a85, which is both a bugfix and a significant performance improvement for Miri, since fixing the bug makes `RUST_BACKTRACE=1` backtraces much smaller: rust-lang/miri#2273, rust-lang/miri-test-libstd#8

This also pulls in other commits which turn the backtrace-rs CI green. That's nice.
workingjubilee pushed a commit to tcdi/postgrestd that referenced this issue Sep 15, 2022
Update backtrace

Most notably, this pulls in rust-lang/backtrace-rs@ebc9a85, which is both a bugfix and a significant performance improvement for Miri, since fixing the bug makes `RUST_BACKTRACE=1` backtraces much smaller: rust-lang/miri#2273, rust-lang/miri-test-libstd#8

This also pulls in other commits which turn the backtrace-rs CI green. That's nice.
@saethlin
Copy link
Member

Between the stack cache and the tag GC, on my machine Miri will execute a simple Cargo project with a test that just panics in 2.3 seconds with RUST_BACKTRACE=1 and 6.6 seconds with RUST_BACKTRACE=full (the tag GC makes full backtraces much faster, it has basically no effect on the pruned ones). It's still not speedy, but I think we're comfortably in the range where this won't be confused with a hang. Do you agree?

@RalfJung
Copy link
Member Author

Yeah agreed.

What does this program with square bracket indexing now print (without isolation)?

@saethlin
Copy link
Member

25.074065ms
44.571429ms
87.756986ms

@RalfJung
Copy link
Member Author

Sorry, I meant with get_unchecked. For me that shows

# primitive indexing
31.261897ms
61.30079ms
122.615057ms

# get_unchecked
84.539718ms
221.671185ms
441.900342ms

So, it's still super-linear, but it's a lot better than before. Awesome!

@saethlin
Copy link
Member

Is it super-linear? I see this

35.167154ms
67.234414ms
131.999157ms

Which is almost exactly doubling of runtime when we double the size of the array.

(I would not expect the tag GC to fix all such cases of super-linear behavior, it's a bit conservative and there's also the RangeMap splitting)

@RalfJung
Copy link
Member Author

The "time divided by array size" in my measurement is

  • 0,33
  • 0,43
  • 0,43

🤷

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-aliasing Area: This affects the aliasing model (Stacked/Tree Borrows) C-bug Category: This is a bug. I-slow Impact: Makes Miri even slower than it already is
Projects
None yet
Development

No branches or pull requests

3 participants