rustc hangs after ICEing due to memory limit #110771

Closed
matthiaskrgr opened this issue Apr 24, 2023 · 4 comments · Fixed by #110975
Labels
C-bug Category: This is a bug. P-medium Medium priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. regression-untriaged Untriaged performance or correctness regression. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@matthiaskrgr
Member

matthiaskrgr commented Apr 24, 2023

I use prlimit to limit rustc's memory usage and maximum runtime per thread (basically a light form of sandboxing), and I have a giant loop of roughly this shape (pseudocode):
files.par_iter().flat_map(|file| rustc_flags.iter().map(|flag| exec(prlimit_run rustc flag file.rs))).collect::<Vec<ICE>>();

I run prlimit --noheadings --as=3076000000 --cpu=30 rustc file flags, so each rustc process is limited to roughly 3 GB of memory (address space) and 30 seconds of CPU time.

I noticed that after 39cf520 (#109507, which I found by bisecting), my rayon loop would sometimes get stuck at random.
It was very weird because there was no CPU load; it was as if someone had suspended one of the rayon threads and we were now waiting indefinitely for it to finish.
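
A note on the missing CPU load: the --cpu limit counts CPU time actually consumed, so a process that is blocked on a lock never reaches it. A minimal sketch of a wall-clock guard for such a runner follows, assuming each job is spawned through std::process::Command. The helper names prlimit_command and wait_with_timeout are hypothetical and are not taken from the actual runner.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Build `prlimit --noheadings --as=<bytes> --cpu=<secs> <program> <args...>`.
/// (Hypothetical helper; the flags are the ones quoted in this report.)
fn prlimit_command(as_bytes: u64, cpu_secs: u64, program: &str, args: &[&str]) -> Command {
    let mut cmd = Command::new("prlimit");
    cmd.arg("--noheadings")
        .arg(format!("--as={as_bytes}"))
        .arg(format!("--cpu={cpu_secs}"))
        .arg(program)
        .args(args);
    cmd
}

/// Wait for `child`, killing it once `timeout` of wall-clock time has passed.
/// Returns `Ok(None)` if the child had to be killed.
fn wait_with_timeout(child: &mut Child, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        // `try_wait` does not block, so a deadlocked child cannot stall the runner.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            child.kill()?;
            child.wait()?; // reap the killed process
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(50));
    }
}
```

A result of Ok(None) means the job was killed by the guard, which a runner could record as a hang rather than as an ICE.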

I can reproduce the problem with tests/ui/limits/huge-struct.rs, for example:

prlimit --noheadings --as=3076000000 --cpu=30 /home/matthias/.rustup/toolchains/master/bin/rustc /tmp/IM/huge-struct.rs -Zdump-mir-dir=/tmp/icemaker_global_tempdir.lyoD6fyhTfAh/rustc_testrunner_tmpdir.VNlnLWzycl69 -Zno-codegen -Zunstable-options -Zvalidate-mir -Zverify-llvm-ir=yes -Zincremental-verify-ich=yes -Zmir-opt-level=0 -Zmir-opt-level=1 -Zmir-opt-level=2 -Zmir-opt-level=3 -Zmir-opt-level=5 -Zunsound-mir-opts -Zdump-mir=all --emit=mir -Zprint-mono-items=eager -Zpolymorphize=on -Zalways-encode-mir -Zdrop-tracking -Zdrop-tracking-mir=yes -Zverbose -Zextra-const-ub-checks --edition=2018 -Ztranslate-lang=en_US -Zprint-type-sizes -Zmaximal-hir-to-mir-coverage -Zstrict-init-checks=yes '-Zcrate-attr=feature(return_type_notation)' '-Zcrate-attr=feature(async_fn_in_trait)' '-Zcrate-attr=feature(impl_trait_in_assoc_type)' '-Zcrate-attr=feature(transmute_generic_consts)' '-Zcrate-attr=feature(fn_ptr_trait)'

This causes rustc to hit the memory limit set by prlimit and ICE. Sometimes the backtrace is much shorter than usual:

thread 'rustc' panicked at 'memory allocation of 671088640 bytes failed', library/alloc/src/alloc.rs:412:17
stack backtrace:
   0:     0x7fb16cf68f33 - std::backtrace_rs::backtrace::libunwind::trace::h93382db32298e592
                               at /rustc/64bcb326516ef7490db46de88b87a4c0990097fe/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x7fb16cf68f33 - std::backtrace_rs::backtrace::trace_unsynchronized::h077b1367e10a1417
                               at /rustc/64bcb326516ef7490db46de88b87a4c0990097fe/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x7fb16cf68f33 - std::sys_common::backtrace::_print_fmt::h68f17e98ca35b7bf
                               at /rustc/64bcb326516ef7490db46de88b87a4c0990097fe/library/std/src/sys_common/backtrace.rs:65:5
   3:     0x7fb16cf68f33 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h341120b670d06b15
                               at /rustc/64bcb326516ef7490db46de88b87a4c0990097fe/library/std/src/sys_common/backtrace.rs:44:22
   4:     0x7fb16cfc9d4f - core::fmt::write::ha614952dcf5c10f0
                               at /rustc/64bcb326516ef7490db46de88b87a4c0990097fe/library/core/src/fmt/mod.rs:1247:17
   5:     0x7fb16cf5bf61 - std::io::Write::write_fmt::h10934ba4b215c50c
                               at /rustc/64bcb326516ef7490db46de88b87a4c0990097fe/library/std/src/io/mod.rs:1712:15
   6:     0x7fb16cf68d45 - std::sys_common::backtrace::_print::heb0370ce5b64b518
                               at /rustc/64bcb326516ef7490db46de88b87a4c0990097fe/library/std/src/sys_common/backtrace.rs:47:5
   7:     0x7fb16cf68d45 - std::sys_common::backtrace::print::h98331c1359b10e7b
                               at /rustc/64bcb326516ef7490db46de88b87a4c0990097fe/library/std/src/sys_common/backtrace.rs:34:9
   8:     0x7fb16cf6b84f - std::panicking::default_hook::{{closure}}::h5b08409ff035083a
   9:     0x7fb16cf6b507 - std::panicking::default_hook::hfc3c525b7b9bf324
                               at /rustc/64bcb326516ef7490db46de88b87a4c0990097fe/library/std/src/panicking.rs:293:9
thread 'rustc' panicked at 'memory allocation of 6291456 bytes failed', library/alloc/src/alloc.rs:412:17

After this output my console is never released, so something is still running.
You may need to run this a couple of times to hit the hang.

@matthiaskrgr matthiaskrgr added regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Apr 24, 2023
@rustbot rustbot added I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Apr 24, 2023
@matthiaskrgr
Member Author

cc @Amanieu

@the8472
Member

the8472 commented Apr 24, 2023

Maybe related, maybe not. I just ran into a hang in backtraces when working on something else:

(gdb) bt
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007fafc1163de5 in std::sys::unix::futex::futex_wait () at library/std/src/sys/unix/futex.rs:62
#2  0x00007fafc115c3a0 in std::sys::unix::locks::futex_mutex::Mutex::lock_contended () at library/std/src/sys/unix/locks/futex_mutex.rs:56
#3  0x00007fafc1161986 in std::sys::unix::locks::futex_mutex::Mutex::lock () at library/std/src/sys/unix/locks/futex_mutex.rs:28
#4  std::sync::mutex::Mutex::lock<()> () at library/std/src/sync/mutex.rs:273
#5  0x00007fafc1174d65 in std::sys_common::backtrace::lock () at library/std/src/sys_common/backtrace.rs:17
#6  std::sys_common::backtrace::print () at library/std/src/sys_common/backtrace.rs:33
#7  0x00007fafc114a8f6 in std::panicking::default_hook::{closure#1} () at library/std/src/panicking.rs:274
#8  0x00007fafc114a52f in std::panicking::default_hook () at library/std/src/panicking.rs:293
#9  0x00007fafc4cbe7ff in rustc_driver_impl::DEFAULT_HOOK::{closure#0}::{closure#0} () at compiler/rustc_driver_impl/src/lib.rs:1208
#10 core::ops::function::FnOnce::call_once<rustc_driver_impl::DEFAULT_HOOK::{closure#0}::{closure_env#0}, (&core::panic::panic_info::PanicInfo)> () at library/core/src/ops/function.rs:250
#11 core::ops::function::FnOnce::call_once<rustc_driver_impl::DEFAULT_HOOK::{closure#0}::{closure_env#0}, (&core::panic::panic_info::PanicInfo)> () at library/core/src/ops/function.rs:250
#12 0x00007fafc114b0d3 in std::panicking::rust_panic_with_hook () at library/std/src/panicking.rs:704
#13 0x00007fafc117558f in std::panicking::begin_panic_handler::{closure#0} () at library/std/src/panicking.rs:586
#14 0x00007fafc1175086 in std::sys_common::backtrace::__rust_end_short_backtrace<std::panicking::begin_panic_handler::{closure_env#0}, !> () at library/std/src/sys_common/backtrace.rs:150
#15 0x00007fafc114aa92 in std::panicking::begin_panic_handler () at library/std/src/panicking.rs:584
#16 0x00007fafc11c3843 in core::panicking::panic_fmt () at library/core/src/panicking.rs:67
#17 0x00007fafc11c38dd in core::panicking::panic () at library/core/src/panicking.rs:117
#18 0x00007fafc1152dd9 in alloc::vec::Vec::extend_desugared<std::backtrace_rs::symbolize::gimli::elf::ParsedSym, alloc::alloc::Global, core::iter::adapters::map::Map<core::iter::adapters::filter::Filter<core::iter::adapters::filter::Filter<core::slice::iter::Iter<object::elf::Sym64<object::endian::LittleEndian>>, std::backtrace_rs::symbolize::gimli::elf::{impl#1}::parse::{closure_env#0}>, std::backtrace_rs::symbolize::gimli::elf::{impl#1}::parse::{closure_env#1}>, std::backtrace_rs::symbolize::gimli::elf::{impl#1}::parse::{closure_env#2}>> () at library/alloc/src/vec/mod.rs:2827
#19 alloc::vec::spec_extend::{impl#0}::spec_extend<std::backtrace_rs::symbolize::gimli::elf::ParsedSym, core::iter::adapters::map::Map<core::iter::adapters::filter::Filter<core::iter::adapters::filter::Filter<core::slice::iter::Iter<object::elf::Sym64<object::endian::LittleEndian>>, std::backtrace_rs::symbolize::gimli::elf::{impl#1}::parse::{closure_env#0}>, std::backtrace_rs::symbolize::gimli::elf::{impl#1}::parse::{closure_env#1}>, std::backtrace_rs::symbolize::gimli::elf::{impl#1}::parse::{closure_env#2}>, alloc::alloc::Global> () at library/alloc/src/vec/spec_extend.rs:17

In my case this was due to a logic bug I introduced, but the same place could panic due to allocation failure.

@Amanieu
Member

Amanieu commented Apr 25, 2023

I can't reproduce this on my machine.

Looking at the backtrace from @the8472, this seems to be happening because a second panic occurred while executing a panic hook, which led to a deadlock on an internal lock used by the backtrace mechanism. I think the proper solution here is to skip executing the panic hook and immediately abort if a panic occurred while executing a panic hook.
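
To illustrate the deadlock described above, here is a minimal sketch, assuming the backtrace lock behaves like a plain non-reentrant std::sync::Mutex. BACKTRACE_LOCK and nested_lock_would_block are hypothetical names, not the real std internals, and try_lock is used so the sketch reports the conflict instead of hanging.

```rust
use std::sync::{Mutex, TryLockError};

// Hypothetical stand-in for the internal lock that serializes backtrace printing.
static BACKTRACE_LOCK: Mutex<()> = Mutex::new(());

/// Returns true if a nested acquisition on the same thread would block.
fn nested_lock_would_block() -> bool {
    // The first panic hook takes the lock before printing the backtrace.
    let _outer = BACKTRACE_LOCK.lock().unwrap();

    // A second panic raised while printing re-enters the hook on the same
    // thread and asks for the same lock. With `lock()` this would wait forever,
    // because the only thread able to release the lock is the one waiting.
    matches!(BACKTRACE_LOCK.try_lock(), Err(TryLockError::WouldBlock))
}
```

A thread parked like this uses no CPU, which would match the symptom in the original report.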

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 25, 2023
…manieu

Revert panic oom

This temporarily reverts rust-lang#109507 until rust-lang#110771 is addressed

r? `@Amanieu`
@apiraino
Contributor

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium +T-compiler

@rustbot rustbot added P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Apr 25, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue May 15, 2023
Rework handling of recursive panics

This PR makes 2 changes to how recursive panics work (a panic while handling a panic).

1. The panic count is no longer used to determine whether to force an immediate abort. This allows code like the following to work without aborting the process immediately:

```rust
struct Double;

impl Drop for Double {
    fn drop(&mut self) {
        // 2 panics are active at once, but this is fine since it is caught.
        let _ = std::panic::catch_unwind(|| panic!("twice"));
    }
}

fn main() {
    let _d = Double;

    // Unwinding from this panic drops `_d`, which panics again inside `drop`.
    panic!("once");
}
```

Rustc already generates appropriate code so that any exceptions escaping out of a `Drop` called in the unwind path will immediately abort the process.

2. Any panics while the panic hook is executing will force an immediate abort. This is necessary to avoid potential deadlocks like rust-lang#110771 where a panic happens while holding the backtrace lock. We don't even try to print the panic message in this case since the panic may have been caused by `Display` impls.

Fixes rust-lang#110771
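
The second change can be pictured with a per-thread flag. This is only a sketch of the decision logic under stated assumptions, not the actual std implementation: IN_PANIC_HOOK, PanicAction, on_panic and run_hook are hypothetical names, and the sketch returns a value where real code would abort the process.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical flag: true while this thread is executing the panic hook.
    static IN_PANIC_HOOK: Cell<bool> = Cell::new(false);
}

#[derive(Debug, PartialEq)]
enum PanicAction {
    RunHook,
    AbortImmediately,
}

/// Decide what a new panic on the current thread should do.
fn on_panic() -> PanicAction {
    if IN_PANIC_HOOK.with(|flag| flag.get()) {
        // Panicking inside the hook: skip the hook (and any message printing).
        PanicAction::AbortImmediately
    } else {
        PanicAction::RunHook
    }
}

/// Run `hook` with the flag set, then clear it again.
fn run_hook<R>(hook: impl FnOnce() -> R) -> R {
    IN_PANIC_HOOK.with(|flag| flag.set(true));
    let result = hook();
    IN_PANIC_HOOK.with(|flag| flag.set(false));
    result
}
```

Because the flag tracks the hook rather than the number of active panics, a panic caught inside a `Drop` during unwinding (as in the `Double` example) would not trigger the abort path in this model.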
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue May 16, 2023
Rework handling of recursive panics

Noratrieb added a commit to Noratrieb/rust that referenced this issue May 16, 2023
Rework handling of recursive panics

@bors bors closed this as completed in f91b634 May 27, 2023
thomcc pushed a commit to tcdi/postgrestd that referenced this issue Jul 18, 2023
Revert panic oom

This temporarily reverts rust-lang/rust#109507 until rust-lang/rust#110771 is addressed

r? `@Amanieu`
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 20, 2024
Revert panic oom

RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 27, 2024
Revert panic oom
