Failing tests and deadlocks on linux/aarch64 (Amazon Graviton2) #687

Open
ulfworsoe opened this issue Dec 16, 2021 · 14 comments · Fixed by #697

Comments

@ulfworsoe

Found on:

  • Linux (Ubuntu 20.04.3) on aarch64 (Amazon Graviton2)
  • TBB version 2021.4.0 and master (883c2e5)
  • building with cmake+ninja (CC=clang CXX=clang++ cmake -GNinja -DCMAKE_INSTALL_PREFIX=$HOME/local/2021.4.0 -DCMAKE_BUILD_TYPE=Release).

Running the TBB tests, some tests failed:

  • In master and 2021.4.0, test_composite_node appears to hang; killed manually after ~1000 seconds
  • In 2021.4.0, test_concurrent_vector appears to hang; killed manually after ~1000 seconds
  • test_eh_thread aborts
@erling-d-andersen

More info in the duplicate but closed issue: #688

@odidev

odidev commented Dec 16, 2021

@anton-potapov

I have been working on installing and testing this package on the amd64 and arm64 architectures. While testing, I am getting 2 test failures on my local arm server. It would be really helpful if you could share some pointers.

98% tests passed; 2 tests failed out of 133 
Total Test time (real) = 476.32 sec 
The following tests FAILED: 
         28 - test_resumable_tasks (SEGFAULT) 
         60 - test_eh_thread (Subprocess aborted) 
Errors while running CTest

Log for reference: oneTBB_tests_arm64.txt

@alexey-katranov
Contributor

test_eh_thread seems strange. It could not create 1024 threads and received an error from pthread_create, but then oneTBB was able to create one more thread. Try setting a breakpoint on pthread_create inside rml_thread_monitor.h and check why it does not fail.

As for the other tests, it is quite difficult to guess what is going wrong (it might be related to the relaxed memory model of aarch64). Is it possible to share core dumps of the hung tests?
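
For readers unfamiliar with what a relaxed memory model implies, here is a minimal sketch of a publish/consume hand-off that is well defined on both x86-64 and aarch64. It only illustrates the class of bug suspected above and is not a diagnosis of these tests; the names payload, ready, run_handoff and check_handoff are made up for the example and do not come from oneTBB.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example, not oneTBB code.
// The producer writes plain data and then publishes it through a flag.
// If both sides used memory_order_relaxed, aarch64 could let the consumer
// observe ready == true before the write to payload becomes visible;
// x86-64's stronger hardware ordering often hides that mistake.
// The release/acquire pair makes the hand-off well defined everywhere.
static int payload = 0;
static std::atomic<bool> ready{false};

int run_handoff() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);

    int seen = -1;
    std::thread consumer([&seen] {
        // The acquire load pairs with the release store below.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // the write of 42 is visible here
    });

    payload = 42;                                  // plain write
    ready.store(true, std::memory_order_release);  // publish
    consumer.join();
    return seen;
}

// Small self-check: the consumer must always observe the published value.
void check_handoff() {
    assert(run_handoff() == 42);
}
```

If a hang or crash disappears when a suspicious relaxed load/store pair is upgraded to acquire/release like this, that is a hint (not proof) that ordering is involved.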

@erling-d-andersen

Due to Christmas vacation we are not likely to return before January. Sorry.

@phprus
Contributor

phprus commented Dec 22, 2021

@alexey-katranov, @anton-potapov The error in test_eh_thread is not a oneTBB issue.

Hardware: Raspberry Pi 4 8GB RAM.
OS:

  1. Ubuntu 20.04.3 (kernel 5.4.0-1047-raspi) + docker, container: oraclelinux:8.5
  2. Ubuntu 20.04.3 (kernel 5.4.0-1047-raspi) without any containers.

Test:

// g++ -pthread -std=c++17 pthread.cpp && ./a.out && echo OK || echo ERROR

#include <algorithm>
#include <atomic>
#include <condition_variable>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

#include <pthread.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <unistd.h>


void limitThreads(size_t limit)
{
    rlimit rlim;

    int ret = getrlimit(RLIMIT_NPROC, &rlim);
    if (ret != 0)
    {
        std::cerr << "getrlimit has returned an error" << std::endl;
        exit(1);
    }

    rlim.rlim_cur = (rlim.rlim_max == (rlim_t)RLIM_INFINITY) ? limit : std::min(limit, rlim.rlim_max);

    ret = setrlimit(RLIMIT_NPROC, &rlim);
    if (ret != 0)
    {
        std::cerr << "setrlimit has returned an error" << std::endl;
        exit(1);
    }
}

static std::mutex m;
static std::condition_variable cv;
static std::atomic<bool> stop{ false };

static void* thread_routine(void*)
{
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [] { return stop == true; });
    return 0;
}

static void* new_thread_routine(void*)
{
    std::cerr << "sleep" << std::endl;
    sleep(60);
    return 0;
}

class Thread {
    pthread_t mHandle{};
    bool mValid{};
public:
    Thread() {
        mValid = false;
        pthread_attr_t attr;
        // Limit the stack size not to consume all virtual memory on 32 bit platforms.
        if (pthread_attr_init(&attr) == 0 && pthread_attr_setstacksize(&attr, 100*1024) == 0) {
            mValid = pthread_create(&mHandle, &attr, thread_routine, /* arg = */ nullptr) == 0;
        }
    }
    bool isValid() const { return mValid; }
    void join() {
        pthread_join(mHandle, nullptr);
    }
};


void check( int error_code, const char* routine )
{
    if (error_code)
    {
        std::cerr << routine << std::endl;
        _exit(1);
    }
}


int main()
{
    // Some systems set really big limit (e.g. >45К) for the number of processes/threads
    limitThreads(1024);

    std::thread /* isolate test */ ([] {
        std::vector<Thread> threads;
        stop = false;
        auto finalize = [&] {
            stop = true;
            cv.notify_all();
            for (auto& t : threads) {
                t.join();
            }
        };

        for (int i = 0;; ++i) {
            Thread thread;
            if (!thread.isValid()) {
                break;
            }
            threads.push_back(thread);
            if (i == 1024) {
                std::cerr << "setrlimit seems having no effect" << std::endl;
                finalize();
                return;
            }
        }

        pthread_t new_handle;
        pthread_attr_t s;
        check(pthread_attr_init( &s ), "pthread_attr_init has failed");
        pthread_t handle;
        void * arg = nullptr;

        int ec = pthread_create( &new_handle, &s, new_thread_routine, arg );
        if (ec) {
            std::cerr << "EXPECTED ERROR: " << "pthread_create has failed" << std::endl;
        } else {
            std::cerr << "UNEXPECTED OK: " << "pthread_create is not failed" << std::endl;
            _exit(1);
        }
        check( pthread_attr_destroy( &s ), "pthread_attr_destroy has failed" );
        pthread_join(new_handle, nullptr);

        finalize();
    }).join();

    return 0;
}

Output:

[root@ubuntu ~]# g++ -pthread -std=c++17 pthread.cpp
[root@ubuntu ~]# ./a.out && echo OK || echo ERROR
sleepUNEXPECTED OK: pthread_create is not failed
ERROR

Maybe a bug in glibc or in the Ubuntu 20.04 kernel? ... Or in my test?

@alexey-katranov
Contributor

Maybe if (pthread_attr_init(&attr) == 0 && pthread_attr_setstacksize(&attr, 100*1024) == 0) failed (rather than pthread_create) when we tried to create 1024 threads?
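
One way to check this hypothesis is to stop folding the three calls into a single bool and record which one failed. The following is a minimal sketch under that assumption; Step, SpawnResult, spawn_with_stack and report are hypothetical names for illustration and are not part of the reproducer or of oneTBB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iostream>
#include <pthread.h>

// Hypothetical helper types, not part of the reproducer or of oneTBB.
enum class Step { None, AttrInit, SetStackSize, Create };

struct SpawnResult {
    Step failed_step;  // Step::None when the thread was created
    int error;         // error code returned by the failing call, else 0
};

static void* noop_routine(void*) { return nullptr; }

// Runs the same three calls as the Thread constructor in the reproducer,
// but reports which call failed instead of a single valid/invalid flag.
SpawnResult spawn_with_stack(std::size_t stack_bytes) {
    pthread_attr_t attr;
    int ec = pthread_attr_init(&attr);
    if (ec != 0) return {Step::AttrInit, ec};

    ec = pthread_attr_setstacksize(&attr, stack_bytes);
    if (ec != 0) {
        pthread_attr_destroy(&attr);
        return {Step::SetStackSize, ec};
    }

    pthread_t handle;
    ec = pthread_create(&handle, &attr, noop_routine, nullptr);
    pthread_attr_destroy(&attr);
    if (ec != 0) return {Step::Create, ec};

    pthread_join(handle, nullptr);
    return {Step::None, 0};
}

// Prints the failing step and the textual form of its error code.
void report(const SpawnResult& r) {
    static const char* const names[] = {
        "none", "pthread_attr_init", "pthread_attr_setstacksize", "pthread_create"};
    const std::size_t idx = static_cast<std::size_t>(r.failed_step);
    assert(idx < sizeof(names) / sizeof(names[0]));
    std::cerr << "failed step: " << names[idx]
              << ", error: " << std::strerror(r.error) << std::endl;
}
```

Calling report(spawn_with_stack(100 * 1024)) on the affected machine would show directly whether the stack-size call or pthread_create is the one returning an error.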

@phprus
Contributor

phprus commented Dec 22, 2021

@alexey-katranov YES!!!
pthread_attr_setstacksize returns EINVAL.

https://man7.org/linux/man-pages/man3/pthread_attr_setstacksize.3.html:

EINVAL The stack size is less than PTHREAD_STACK_MIN bytes.
On some systems, pthread_attr_setstacksize() can fail with the error EINVAL if stacksize is not a multiple of the system page size.

PTHREAD_STACK_MIN is defined in <limits.h>.

On aarch64, PTHREAD_STACK_MIN == 131072, or it is a dynamic value: sysconf(_SC_THREAD_STACK_MIN).
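
Given the two EINVAL conditions quoted above, a test that wants a small stack could clamp the requested size to the platform minimum and round it up to a page multiple before calling pthread_attr_setstacksize. This is a minimal sketch of that idea and not necessarily what the fix in PR #697 does; usable_stack_size and set_stack_size are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits.h>   // PTHREAD_STACK_MIN
#include <pthread.h>
#include <unistd.h>   // sysconf

// Hypothetical helper: returns a stack size that is at least `requested`,
// at least the platform minimum, and a multiple of the page size.
std::size_t usable_stack_size(std::size_t requested) {
    long min_stack = sysconf(_SC_THREAD_STACK_MIN);
    if (min_stack <= 0) {
        min_stack = PTHREAD_STACK_MIN;  // fall back to the <limits.h> value
    }
    const long page_l = sysconf(_SC_PAGESIZE);
    assert(page_l > 0);
    const std::size_t page = static_cast<std::size_t>(page_l);

    std::size_t size = requested;
    if (size < static_cast<std::size_t>(min_stack)) {
        size = static_cast<std::size_t>(min_stack);
    }
    // Round up to the next page boundary.
    return (size + page - 1) / page * page;
}

// Hypothetical wrapper: applies the adjusted size to an initialized attr.
bool set_stack_size(pthread_attr_t& attr, std::size_t requested) {
    return pthread_attr_setstacksize(&attr, usable_stack_size(requested)) == 0;
}
```

With this, a request of 100*1024 bytes would be raised to at least 131072 on a system where the minimum is 131072, and left at a page-rounded value near 100 KiB where the minimum is smaller.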

phprus added a commit to phprus/oneTBB that referenced this issue Dec 23, 2021
Signed-off-by: Vladislav Shchapov <[email protected]>
isaevil pushed a commit that referenced this issue Dec 23, 2021
Signed-off-by: Vladislav Shchapov <[email protected]>
@phprus
Contributor

phprus commented Dec 23, 2021

@isaevil PR #697 fixes only the test_eh_thread test; the tests test_composite_node, test_concurrent_vector and test_resumable_tasks are not fixed by it.
Please reopen this issue.

@isaevil
Contributor

isaevil commented Dec 23, 2021

@phprus I guess the issue was closed automatically because you mentioned it. Reopening.

@isaevil isaevil reopened this Dec 23, 2021
@phprus
Contributor

phprus commented Dec 24, 2021

test_resumable_tasks on Raspberry Pi 4.

Release build:

[root@ubuntu gcc85-release]# ./gnu_8.5_cxx17_64_release/test_resumable_tasks
[doctest] doctest version is "2.4.6"
[doctest] run with "--help" for options
===============================================================================
/root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/test/tbb/test_resumable_tasks.cpp:431:
TEST CASE:  Nested arena

/root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/test/tbb/test_resumable_tasks.cpp:431: FATAL ERROR: test case CRASHED: SIGSEGV - Segmentation violation signal

===============================================================================
[doctest] test cases:       2 |       0 passed | 2 failed | 3 skipped
[doctest] assertions: 7000011 | 7000011 passed | 0 failed |
[doctest] Status: FAILURE!
Segmentation fault (core dumped)

[root@ubuntu gcc85-release]# gdb ./gnu_8.5_cxx17_64_release/test_resumable_tasks core
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-16.0.4.el8
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./gnu_8.5_cxx17_64_release/test_resumable_tasks...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 105481]
[New LWP 105488]
[New LWP 105476]
[New LWP 105487]
[New LWP 105489]
[New LWP 105490]
[New LWP 105482]
[New LWP 105484]
[New LWP 105486]
[New LWP 105485]
[New LWP 105483]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./gnu_8.5_cxx17_64_release/test_resumable_tasks'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000042ebfc in void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<OutermostArenaBody, int>, tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<int> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<OutermostArenaBody, int>, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<int>&, tbb::detail::d1::execution_data&) ()
[Current thread is 1 (Thread 0xffff88662010 (LWP 105481))]
(gdb) bt
#0  0x000000000042ebfc in void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<OutermostArenaBody, int>, tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<int> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<OutermostArenaBody, int>, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<int>&, tbb::detail::d1::execution_data&) ()
#1  0x000000000042f36c in tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<OutermostArenaBody, int>, tbb::detail::d1::auto_partitioner const>::execute(tbb::detail::d1::execution_data&) ()
#2  0x0000ffff8b17d260 in tbb::detail::r1::co_local_wait_for_all(unsigned int, unsigned int) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#3  0x0000ffff8ad4d140 in ?? () at ../sysdeps/unix/sysv/linux/aarch64/setcontext.S:123 from /lib64/libc.so.6
Backtrace stopped: not enough registers or memory available to unwind further
(gdb) thread 2
[Switching to thread 2 (Thread 0xffff83e45010 (LWP 105488))]
#0  0x0000ffff8add95f8 in mprotect () at ../sysdeps/unix/syscall-template.S:78
78      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0  0x0000ffff8add95f8 in mprotect () at ../sysdeps/unix/syscall-template.S:78
#1  0x0000ffff8b173d50 in tbb::detail::r1::create_coroutine(tbb::detail::r1::coroutine_type&, unsigned long, void*) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#2  0x0000ffff8b1790c8 in tbb::detail::r1::suspend(void (*)(void*, tbb::detail::r1::suspend_point_type*), void*) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#3  0x000000000042db2c in tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<InnermostArenaBody::InnermostInnerParFor, int>, tbb::detail::d1::auto_partitioner const>::execute(tbb::detail::d1::execution_data&) ()
#4  0x0000ffff8b17d260 in tbb::detail::r1::co_local_wait_for_all(unsigned int, unsigned int) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#5  0x0000ffff8ad4d140 in ?? () at ../sysdeps/unix/sysv/linux/aarch64/setcontext.S:123 from /lib64/libc.so.6
Backtrace stopped: not enough registers or memory available to unwind further
(gdb) thread 3
[Switching to thread 3 (Thread 0xffff8b1f3010 (LWP 105476))]
#0  __GI___mmap64 (offset=<optimized out>, fd=-1, flags=131106, prot=0, len=139264, addr=<optimized out>) at ../sysdeps/unix/sysv/linux/mmap64.c:52
52        return (void *) MMAP_CALL (mmap, addr, len, prot, flags, fd, offset);
(gdb) bt
#0  __GI___mmap64 (offset=<optimized out>, fd=-1, flags=131106, prot=0, len=139264, addr=<optimized out>) at ../sysdeps/unix/sysv/linux/mmap64.c:52
#1  __GI___mmap64 (addr=<optimized out>, len=139264, prot=0, flags=131106, fd=-1, offset=0) at ../sysdeps/unix/sysv/linux/mmap64.c:40
#2  0x0000ffff8b173d3c in tbb::detail::r1::create_coroutine(tbb::detail::r1::coroutine_type&, unsigned long, void*) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#3  0x0000ffff8b1790c8 in tbb::detail::r1::suspend(void (*)(void*, tbb::detail::r1::suspend_point_type*), void*) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#4  0x000000000042e440 in void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<InnermostArenaBody::InnermostOuterParFor, int>, tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<int> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<InnermostArenaBody::InnermostOuterParFor, int>, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<int>&, tbb::detail::d1::execution_data&) ()
#5  0x000000000042e814 in tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<InnermostArenaBody::InnermostOuterParFor, int>, tbb::detail::d1::auto_partitioner const>::execute(tbb::detail::d1::execution_data&) ()
#6  0x0000ffff8b17d260 in tbb::detail::r1::co_local_wait_for_all(unsigned int, unsigned int) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#7  0x0000ffff8ad4d140 in ?? () at ../sysdeps/unix/sysv/linux/aarch64/setcontext.S:123 from /lib64/libc.so.6
Backtrace stopped: not enough registers or memory available to unwind further
(gdb) thread 4
[Switching to thread 4 (Thread 0xffff8aa6a010 (LWP 105487))]
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xffffef716e00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
88        int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
(gdb) bt
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xffffef716e00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0xffffef716da8, cond=0xffffef716dd8) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0xffffef716dd8, mutex=0xffffef716da8) at pthread_cond_wait.c:655
#3  0x0000ffff8b05d738 in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /usr/src/debug/gcc-8.5.0-4.0.1.el8_5.aarch64/obj-aarch64-redhat-linux/aarch64-redhat-linux/libstdc++-v3/include/aarch64-redhat-linux/bits/gthr-default.h:864
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x00000000004277dc in AsyncActivity::asyncLoop(AsyncActivity*) ()
#6  0x0000ffff8b063cf4 in std::execute_native_thread_routine (__p=0x29af1fd0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x0000ffff8ae878f8 in start_thread (arg=0xffff8b063cd8 <std::execute_native_thread_routine(void*)>) at pthread_create.c:479
#8  0x0000ffff8addd1fc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) thread 5
[Switching to thread 5 (Thread 0xffff83dbe010 (LWP 105489))]
#0  0x0000ffff8add95f8 in mprotect () at ../sysdeps/unix/syscall-template.S:78
78      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0  0x0000ffff8add95f8 in mprotect () at ../sysdeps/unix/syscall-template.S:78
#1  0x0000ffff8b173d50 in tbb::detail::r1::create_coroutine(tbb::detail::r1::coroutine_type&, unsigned long, void*) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#2  0x0000ffff8b1790c8 in tbb::detail::r1::suspend(void (*)(void*, tbb::detail::r1::suspend_point_type*), void*) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#3  0x000000000042db2c in tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<InnermostArenaBody::InnermostInnerParFor, int>, tbb::detail::d1::auto_partitioner const>::execute(tbb::detail::d1::execution_data&) ()
#4  0x0000ffff8b17d260 in tbb::detail::r1::co_local_wait_for_all(unsigned int, unsigned int) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#5  0x0000ffff8ad4d140 in ?? () at ../sysdeps/unix/sysv/linux/aarch64/setcontext.S:123 from /lib64/libc.so.6
Backtrace stopped: not enough registers or memory available to unwind further
(gdb) thread 6
[Switching to thread 6 (Thread 0xffff83be3010 (LWP 105490))]
#0  0x0000ffff8b17c980 in tbb::detail::r1::co_local_wait_for_all(unsigned int, unsigned int) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
(gdb) bt
#0  0x0000ffff8b17c980 in tbb::detail::r1::co_local_wait_for_all(unsigned int, unsigned int) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#1  0x0000ffff8ad4d140 in ?? () at ../sysdeps/unix/sysv/linux/aarch64/setcontext.S:123 from /lib64/libc.so.6
Backtrace stopped: not enough registers or memory available to unwind further
(gdb) thread 7
[Switching to thread 7 (Thread 0xffff88641010 (LWP 105482))]
#0  swapcontext () at ../sysdeps/unix/sysv/linux/aarch64/swapcontext.S:90
90              cbnz    x0, 1f
(gdb) bt
#0  swapcontext () at ../sysdeps/unix/sysv/linux/aarch64/swapcontext.S:90
#1  0x0000ffff8b178e24 in tbb::detail::r1::task_dispatcher::resume(tbb::detail::r1::task_dispatcher&) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#2  0x0000ffff8b178ef4 in tbb::detail::r1::suspend(void (*)(void*, tbb::detail::r1::suspend_point_type*), void*) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#3  0x000000000042db2c in tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<InnermostArenaBody::InnermostInnerParFor, int>, tbb::detail::d1::auto_partitioner const>::execute(tbb::detail::d1::execution_data&) ()
#4  0x0000ffff8b17c368 in tbb::detail::r1::task_dispatcher::execute_and_wait(tbb::detail::d1::task*, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#5  0x000000000042e504 in void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<InnermostArenaBody::InnermostOuterParFor, int>, tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<int> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<InnermostArenaBody::InnermostOuterParFor, int>, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<int>&, tbb::detail::d1::execution_data&) ()
#6  0x000000000042e814 in tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<InnermostArenaBody::InnermostOuterParFor, int>, tbb::detail::d1::auto_partitioner const>::execute(tbb::detail::d1::execution_data&) ()
#7  0x0000ffff8b17d260 in tbb::detail::r1::co_local_wait_for_all(unsigned int, unsigned int) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#8  0x0000ffff8ad4d140 in ?? () at ../sysdeps/unix/sysv/linux/aarch64/setcontext.S:123 from /lib64/libc.so.6
Backtrace stopped: not enough registers or memory available to unwind further
(gdb) thread 8
[Switching to thread 8 (Thread 0xffff89267010 (LWP 105484))]
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xffffef716e04) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
88        int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
(gdb) bt
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xffffef716e04) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0xffffef716da8, cond=0xffffef716dd8) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0xffffef716dd8, mutex=0xffffef716da8) at pthread_cond_wait.c:655
#3  0x0000ffff8b05d738 in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /usr/src/debug/gcc-8.5.0-4.0.1.el8_5.aarch64/obj-aarch64-redhat-linux/aarch64-redhat-linux/libstdc++-v3/include/aarch64-redhat-linux/bits/gthr-default.h:864
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x00000000004277dc in AsyncActivity::asyncLoop(AsyncActivity*) ()
#6  0x0000ffff8b063cf4 in std::execute_native_thread_routine (__p=0x29af1f90) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x0000ffff8ae878f8 in start_thread (arg=0xffff8b063cd8 <std::execute_native_thread_routine(void*)>) at pthread_create.c:479
#8  0x0000ffff8addd1fc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) thread 9
[Switching to thread 9 (Thread 0xffff8a269010 (LWP 105486))]
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xffffef716e00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
88        int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
(gdb) bt
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xffffef716e00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0xffffef716da8, cond=0xffffef716dd8) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0xffffef716dd8, mutex=0xffffef716da8) at pthread_cond_wait.c:655
#3  0x0000ffff8b05d738 in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /usr/src/debug/gcc-8.5.0-4.0.1.el8_5.aarch64/obj-aarch64-redhat-linux/aarch64-redhat-linux/libstdc++-v3/include/aarch64-redhat-linux/bits/gthr-default.h:864
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x00000000004277dc in AsyncActivity::asyncLoop(AsyncActivity*) ()
#6  0x0000ffff8b063cf4 in std::execute_native_thread_routine (__p=0x29af22b0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x0000ffff8ae878f8 in start_thread (arg=0xffff8b063cd8 <std::execute_native_thread_routine(void*)>) at pthread_create.c:479
#8  0x0000ffff8addd1fc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) thread 10
[Switching to thread 10 (Thread 0xffff89a68010 (LWP 105485))]
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xffffef716e00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
88        int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
(gdb) bt
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0xffffef716e00) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0xffffef716da8, cond=0xffffef716dd8) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=0xffffef716dd8, mutex=0xffffef716da8) at pthread_cond_wait.c:655
#3  0x0000ffff8b05d738 in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /usr/src/debug/gcc-8.5.0-4.0.1.el8_5.aarch64/obj-aarch64-redhat-linux/aarch64-redhat-linux/libstdc++-v3/include/aarch64-redhat-linux/bits/gthr-default.h:864
#4  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:53
#5  0x00000000004277dc in AsyncActivity::asyncLoop(AsyncActivity*) ()
#6  0x0000ffff8b063cf4 in std::execute_native_thread_routine (__p=0x29af2410) at ../../../../../libstdc++-v3/src/c++11/thread.cc:80
#7  0x0000ffff8ae878f8 in start_thread (arg=0xffff8b063cd8 <std::execute_native_thread_routine(void*)>) at pthread_create.c:479
#8  0x0000ffff8addd1fc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) thread 11
[Switching to thread 11 (Thread 0xffff88620010 (LWP 105483))]
#0  __ieee754_log10 (x=7000009) at ../sysdeps/ieee754/dbl-64/wordsize-64/e_log10.c:62
62        EXTRACT_WORDS64 (hx, x);
(gdb) bt
#0  __ieee754_log10 (x=7000009) at ../sysdeps/ieee754/dbl-64/wordsize-64/e_log10.c:62
#1  0x0000000000412bd0 in doctest::(anonymous namespace)::ConsoleReporter::test_run_end(doctest::TestRunStats const&) ()
#2  0x0000000000415a58 in doctest::(anonymous namespace)::FatalConditionHandler::handleSignal(int) ()
#3  <signal handler called>
#4  0x000000000042edac in void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<OutermostArenaBody, int>, tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<int> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<OutermostArenaBody, int>, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<int>&, tbb::detail::d1::execution_data&) ()
#5  0x000000000042f36c in tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<int>, tbb::detail::d1::parallel_for_body_wrapper<OutermostArenaBody, int>, tbb::detail::d1::auto_partitioner const>::execute(tbb::detail::d1::execution_data&) ()
#6  0x0000ffff8b17ede0 in tbb::detail::r1::market::process(rml::job&) () from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#7  0x0000ffff8b17950c in tbb::detail::r1::rml::private_worker::thread_routine(void*) ()
   from /root/oneTBB-013035b4e9af39f506e87ae6b755c3363e768d4d/build/gcc85-release/gnu_8.5_cxx17_64_release/libtbb.so.12
#8  0x0000ffff8ae878f8 in start_thread (arg=0xffff8b179480 <tbb::detail::r1::rml::private_worker::thread_routine(void*)>) at pthread_create.c:479
#9  0x0000ffff8addd1fc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) thread 12
Unknown thread 12.
(gdb)

The RelWithDebInfo build works without error.
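
For context on threads 2, 3 and 5: their backtraces show create_coroutine calling mmap with prot=0 and then mprotect. The value flags=131106 is 0x20022, which on Linux corresponds to MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK. That sequence looks like the common pattern of reserving a stack region with no access and then opening up its interior so that guard pages remain at the edges. Below is a minimal sketch of that general pattern, assuming one guard page on each side; it is not oneTBB's actual code, and GuardedStack, allocate_guarded_stack and free_guarded_stack are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical sketch of a guarded stack allocation; not oneTBB's code.
struct GuardedStack {
    void* base;         // start of the whole mapping, or nullptr on failure
    std::size_t total;  // bytes mapped, including both guard pages
    void* usable;       // first usable byte
    std::size_t size;   // usable bytes
};

GuardedStack allocate_guarded_stack(std::size_t usable_bytes) {
    const long page_l = sysconf(_SC_PAGESIZE);
    assert(page_l > 0);
    const std::size_t page = static_cast<std::size_t>(page_l);
    const std::size_t usable = (usable_bytes + page - 1) / page * page;
    const std::size_t total = usable + 2 * page;

    // Step 1: reserve the whole region with no access rights (prot == 0).
    void* base = mmap(nullptr, total, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (base == MAP_FAILED) return {nullptr, 0, nullptr, 0};

    // Step 2: make the interior readable and writable, leaving one
    // inaccessible page below and one above it.
    void* interior = static_cast<char*>(base) + page;
    if (mprotect(interior, usable, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, total);
        return {nullptr, 0, nullptr, 0};
    }
    return {base, total, interior, usable};
}

void free_guarded_stack(const GuardedStack& s) {
    if (s.base != nullptr) munmap(s.base, s.total);
}
```

With guard pages in place, running off either end of such a stack faults immediately with SIGSEGV instead of silently corrupting neighbouring memory, which is one reason a stack that is too small for a given build can show up as a segmentation fault.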

@nofuturre

@ulfworsoe is this issue still relevant?

@ulfworsoe
Author

I don't know if it's still relevant with the latest oneAPI release. I'll have to check.
I'm on vacation at the moment, so if you can leave the issue open, I can run the tests in a couple of weeks.

@nofuturre

Sure, thank you for the quick response.

@ulfworsoe
Author

I have rerun the tests. I don't have access to a Graviton at the moment, so they were run on a machine with similar capabilities.

  • System: Ubuntu 20.04.6 LTS
  • Machine: NVIDIA Orin Jetson
  • Repo Tag: v2021.13.0

There is one test that hangs, but I can't say if it is the same issue:

 86/137 Test  #86: conformance_parallel_for .................Subprocess terminated***Exception: 8035.38 sec

This is where it hangs:

#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0x0000ffff9634a9d8 in tbb::detail::r1::futex_wait (comparand=2, futex=0xfffff6a4a870) at /home/ulfw/download/oneTBB/src/tbb/semaphore.h:253
#2  tbb::detail::r1::binary_semaphore::P (this=0xfffff6a4a870) at /home/ulfw/download/oneTBB/src/tbb/semaphore.h:253
#3  tbb::detail::r1::sleep_node<tbb::detail::r1::market_context>::wait (this=0xfffff6a4a840) at /home/ulfw/download/oneTBB/src/tbb/concurrent_monitor.h:170
#4  tbb::detail::r1::concurrent_monitor_base<tbb::detail::r1::market_context>::commit_wait (node=..., this=<optimized out>) at /home/ulfw/download/oneTBB/src/tbb/concurrent_monitor.h:232
#5  tbb::detail::r1::concurrent_monitor_base<tbb::detail::r1::market_context>::wait<tbb::detail::r1::sleep_node<tbb::detail::r1::market_context>, tbb::detail::r1::external_waiter::pause(tbb::detail::r1::arena_slot&)::{lambda()#1}&>(tbb::detail::r1::external_waiter::pause(tbb::detail::r1::arena_slot&)::{lambda()#1}&, tbb::detail::r1::sleep_node<tbb::detail::r1::market_context>&&) (node=..., pred=<synthetic pointer>..., 
    this=<optimized out>) at /home/ulfw/download/oneTBB/src/tbb/concurrent_monitor.h:262
#6  tbb::detail::r1::sleep_waiter::sleep<tbb::detail::r1::external_waiter::pause(tbb::detail::r1::arena_slot&)::{lambda()#1}>(unsigned long, tbb::detail::r1::external_waiter::pause(tbb::detail::r1::arena_slot&)::{lambda()#1}) (wakeup_condition=..., uniq_tag=281474819729704, this=0xfffff6a4a7e8) at /home/ulfw/download/oneTBB/src/tbb/waiters.h:133
#7  tbb::detail::r1::external_waiter::pause (this=0xfffff6a4a7e8) at /home/ulfw/download/oneTBB/src/tbb/waiters.h:160
#8  tbb::detail::r1::external_waiter::pause (this=<optimized out>) at /home/ulfw/download/oneTBB/src/tbb/waiters.h:153
#9  tbb::detail::r1::task_dispatcher::receive_or_steal_task<false, tbb::detail::r1::external_waiter> (critical_allowed=true, fifo_allowed=true, isolation=0, waiter=..., ed=..., tls=..., this=<optimized out>) at /home/ulfw/download/oneTBB/src/tbb/task_dispatcher.h:232
#10 tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter> (waiter=..., t=0x0, this=0xffff95e07480) at /home/ulfw/download/oneTBB/src/tbb/task_dispatcher.h:351
#11 tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::external_waiter> (waiter=..., t=<optimized out>, this=0xffff95e07480) at /home/ulfw/download/oneTBB/src/tbb/task_dispatcher.h:459
#12 tbb::detail::r1::task_dispatcher::execute_and_wait (t=<optimized out>, wait_ctx=..., w_ctx=...) at /home/ulfw/download/oneTBB/src/tbb/task_dispatcher.cpp:168
#13 0x0000aaaad48043f0 in tbb::detail::d1::execute_and_wait (w_ctx=..., wait_ctx=..., t_ctx=..., t=...) at /home/ulfw/download/oneTBB/src/tbb/../../include/oneapi/tbb/detail/_task.h:191
#14 tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned short>, tbb::detail::d1::parallel_for_body_wrapper<TestFunctor<unsigned short>, unsigned short>, tbb::detail::d1::simple_partitioner const>::run (partitioner=..., context=..., body=<synthetic pointer>..., range=<synthetic pointer>...) at /home/ulfw/download/oneTBB/src/tbb/../../include/oneapi/tbb/parallel_for.h:112
#15 tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned short>, tbb::detail::d1::parallel_for_body_wrapper<TestFunctor<unsigned short>, unsigned short>, tbb::detail::d1::simple_partitioner const>::run (partitioner=..., body=<synthetic pointer>..., range=<synthetic pointer>...) at /home/ulfw/download/oneTBB/src/tbb/../../include/oneapi/tbb/parallel_for.h:101
#16 tbb::detail::d1::parallel_for<tbb::detail::d1::blocked_range<unsigned short>, tbb::detail::d1::parallel_for_body_wrapper<TestFunctor<unsigned short>, unsigned short> > (partitioner=..., body=<synthetic pointer>..., range=<synthetic pointer>...) at /home/ulfw/download/oneTBB/src/tbb/../../include/oneapi/tbb/parallel_for.h:237
#17 tbb::detail::d1::parallel_for_impl<unsigned short, TestFunctor<unsigned short>, tbb::detail::d1::simple_partitioner const> (last=1024, partitioner=..., f=..., step=221, first=0) at /home/ulfw/download/oneTBB/src/tbb/../../include/oneapi/tbb/parallel_for.h:314
#18 tbb::detail::d1::parallel_for_impl<unsigned short, TestFunctor<unsigned short>, tbb::detail::d1::simple_partitioner const> (last=1024, partitioner=..., f=..., step=221, first=0) at /home/ulfw/download/oneTBB/src/tbb/../../include/oneapi/tbb/parallel_for.h:306
#19 tbb::detail::d1::parallel_for<unsigned short, TestFunctor<unsigned short> > (partitioner=..., f=..., step=221, last=1024, first=0) at /home/ulfw/download/oneTBB/src/tbb/../../include/oneapi/tbb/parallel_for.h:328
#20 InvokerStep<parallel_tag, tbb::detail::d1::simple_partitioner const, unsigned short, TestFunctor<unsigned short> >::operator() (this=<synthetic pointer>, first=<synthetic pointer>: <optimized out>, last=<synthetic pointer>: 1024, step=<synthetic pointer>: <optimized out>, p=..., f=...) at /home/ulfw/download/oneTBB/test/conformance/conformance_parallel_for.cpp:148
#21 TestParallelForWithStepSupportHelper<parallel_tag, unsigned short, tbb::detail::d1::simple_partitioner const> (p=...) at /home/ulfw/download/oneTBB/test/conformance/conformance_parallel_for.cpp:210
#22 0x0000aaaad48051d0 in TestParallelForWithStepSupport<parallel_tag, unsigned short> () at /home/ulfw/download/oneTBB/src/tbb/../../include/oneapi/tbb/partitioner.h:629
#23 0x0000aaaad47eea4c in doctest::Context::run (this=<optimized out>) at /home/ulfw/download/oneTBB/test/common/doctest.h:7060
#24 0x0000aaaad47d361c in main (argc=<optimized out>, argv=<optimized out>) at /home/ulfw/download/oneTBB/test/common/doctest.h:7138
