Gazebo fails to build for x86_64 on AArch64 with QEMU user-mode emulation through binfmt_misc #296

Open
hacker1024 opened this issue Aug 31, 2023 · 3 comments

Comments

@hacker1024
Contributor

The Gazebo build process fails due to a segmentation fault in an automatic MOC step when using QEMU user-mode emulation for x86_64 on AArch64.

This is most likely not a problem caused by this overlay, but maybe we can come up with something to add to fix this. I have no idea where to even start debugging, though - it's especially painful due to the speed of emulation.

To reproduce

  1. On an AArch64 machine, enable QEMU user-mode emulation for x86_64 through binfmt_misc (e.g. by adding x86_64-linux to boot.binfmt.emulatedSystems on NixOS)
  2. Build Gazebo (add --check if it already exists)
$ nix-build --system x86_64-linux \
    -I nix-ros-overlay=https://github.com/lopsided98/nix-ros-overlay/archive/develop.tar.gz \
    --extra-substituters 'https://ros.cachix.org' \
    --extra-trusted-public-keys 'ros.cachix.org-1:dSyZxI8geDCJrwgvCOHDoAfOm5sV1wCPjBkKL+38Rvo=' \
    '<nix-ros-overlay>' -A gazebo
  3. Note that it fails
[  3%] Automatic MOC for target gzqtpropertybrowser
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
make[2]: *** [gazebo/gui/qtpropertybrowser/CMakeFiles/gzqtpropertybrowser_autogen.dir/build.make:71: gazebo/gui/qtpropertybrowser/CMakeFiles/gzqtpropertybrowser_autogen] Segmentation fault (core dumped)
make[1]: *** [CMakeFiles/Makefile2:3866: gazebo/gui/qtpropertybrowser/CMakeFiles/gzqtpropertybrowser_autogen.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
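
Before starting the slow emulated build, it may help to confirm that step 1 actually registered a handler. The following is a minimal sketch, not part of the original report: the function name list_qemu_x86_64_handlers is made up for illustration, and it assumes the registered interpreter path contains the string qemu-x86_64 (a wrapper with a different name would not be found).

```shell
#!/bin/sh
# Hypothetical helper: list binfmt_misc entries whose interpreter path
# contains "qemu-x86_64", together with their enabled/disabled state.
# $1: directory to scan; defaults to the kernel's binfmt_misc mount point.
list_qemu_x86_64_handlers() {
    dir="${1:-/proc/sys/fs/binfmt_misc}"
    for entry in "$dir"/*; do
        # Ignore anything that is not a regular file (or an unexpanded glob).
        [ -f "$entry" ] || continue
        # The "register" and "status" control files never match this pattern;
        # read errors on them are silenced.
        if grep -q '^interpreter .*qemu-x86_64' "$entry" 2>/dev/null; then
            # The first line of an entry is "enabled" or "disabled".
            printf '%s: %s\n' "$(basename "$entry")" "$(head -n 1 "$entry")"
        fi
    done
}
```

Calling list_qemu_x86_64_handlers with no argument scans /proc/sys/fs/binfmt_misc. No output means no matching handler was found.
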
@wentasah
Contributor

wentasah commented Sep 3, 2023

Can you get a backtrace of the crash? IMHO the easiest way is coredumpctl dump (to extract the core file) or coredumpctl debug followed by bt at the debugger prompt.
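
A rough sketch of those steps, assuming systemd-coredump captured the crash and the crashing process name is qemu-x86_64. The COREDUMPCTL variable and the qemu_backtrace function are illustrative names only, not part of systemd. The variable exists so the function can be dry-run with a stub.

```shell
#!/bin/sh
# Illustrative override hook (an assumption of this sketch, not a systemd
# feature); defaults to the real coredumpctl binary.
COREDUMPCTL="${COREDUMPCTL:-coredumpctl}"

# Hypothetical helper: show metadata for the newest qemu-x86_64 core dump,
# then open that dump in the debugger.
qemu_backtrace() {
    # -1 limits the output to the most recent matching core dump.
    "$COREDUMPCTL" -1 info qemu-x86_64 || return 1
    # Starts the debugger on the last matching dump; type `bt` at its prompt
    # to print the backtrace.
    "$COREDUMPCTL" debug qemu-x86_64
}
```

Run qemu_backtrace as the user who owns the dump (root in a root build session), since other users' core dumps may not be readable.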

Which hardware do you run this on? Could insufficient RAM be the cause?

@hacker1024
Contributor Author

I have noticed the same issue in rqt-gui-cpp, so I will be using that for debugging going forward as it is much faster to build.

The backtraces are copied below. I am using a fairly powerful free-tier Oracle Cloud ARM VM - 4 ARM cores / 24GB RAM / 200GB SSD, so I don't think there are any issues there.

           PID: 4109482 (qemu-x86_64)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Sun 2023-10-01 21:29:51 AEDT (5min ago)
  Command Line: /nix/store/0ygi905vqpqq0qxnxphwaj922f0r4y7w-qemu-8.0.3/bin/qemu-x86_64 -0 /nix/store/0dv0ylafnx7cdajyv9ahbpqrniblixq1-cmake-3.26.4/bin/cmake -E LD_LIBRARY_PATH=/nix/store/f407ln9qba45aqf4223lfq1gnkb41cqs-ros-humble-rosgraph-msgs-1.2.1-r1/lib:/nix/store/gwfjhrxngsshnw6b338jnv1f402c3s1d-ros-humble-std-msgs-4.2.3-r1/lib:/nix/store/y7ka21zj52cks7qkfbli1wszswdcjzpg-ros-humble-statistics-msgs-1.2.1-r1/lib:/nix/store/ivlj5ayya3a6fwpphdyx44sh3a7jbxij-ros-humble-tracetools-4.1.1-r1/lib:/nix/store/j5b2i6pxdpaqq7si806dk3n66fzfwksn-ros-humble-rmw-fastrtps-shared-cpp-6.2.4-r1/lib:/nix/store/ry7sxn7qans86j8vhnpya2m3a1p90g7q-ros-humble-rmw-dds-common-1.6.0-r2/lib:/nix/store/5q4afl4ag8xcasss46kq1xgbqmgamf90-ros-humble-rmw-fastrtps-cpp-6.2.4-r1/lib:/nix/store/6p5hi0fk7mknlnm9038dhfcy3w1yff6v-ros-humble-libyaml-vendor-1.2.2-r2/lib:/nix/store/k1k1wkkqkibv0h2i98r6x5pf6lkr8g3z-ros-humble-rcl-yaml-param-parser-5.3.5-r1/lib:/nix/store/jwhgbbxd869m0y9ylaf8p4g5jvcm90q3-ros-humble-rcl-logging-spdlog-2.3.1-r1/lib:/nix/store/04jqsc89jfniy3cmdgzd96262wjv8ych-ros-humble-rcl-interfaces-1.2.1-r1/lib:/nix/store/xnzz6fvil92lz8gl7ad0yrxn79c01cy4-ros-humble-rcl-5.3.5-r1/lib:/nix/store/8kg0l72hz5bxzjxhbz3ixfyvvygar2d1-ros-humble-libstatistics-collector-1.3.1-r1/lib:/nix/store/m4p2fig81nm03xqyw1plvfnxyy5nk12m-ros-humble-rosidl-typesupport-introspection-cpp-3.1.5-r2/lib:/nix/store/5n2fyvrx5jl61n4cixvixrnhwlah8vcw-ros-humble-rosidl-typesupport-introspection-c-3.1.5-r2/lib:/nix/store/n6xz4y5dp94m25f94yjwb48xbmqwz11y-ros-humble-rosidl-typesupport-fastrtps-cpp-2.2.1-r1/lib:/nix/store/99shz2bmkcc9qzdmm5k3fs60k53jiw4i-ros-humble-rosidl-typesupport-fastrtps-c-2.2.1-r1/lib:/nix/store/xp4k7xg7121wv1y9api2dq28bv36wc0v-ros-humble-rosidl-typesupport-cpp-2.0.1-r1/lib:/nix/store/m5vm2ha7hnznyjlcj50ddhs2s4k6d3vb-ros-humble-rosidl-typesupport-c-2.0.1-r1/lib:/nix/store/lrdpzfs0g1930kgxcnrjncxaday2ydv6-ros-humble-rosidl-runtime-c-3.1.5-r2/lib:/nix/store/i76pz00l4fk0qccdr0afglw5i4y2k1vr-ros-humble-rmw-6.1.1
-r1/lib:/nix/store/ldvmjmdysp0jxhv6raxrqcdkrdmm360w-ros-humble-builtin-interfaces-1.2.1-r1/lib:/nix/store/5ybrr558di26wxjgxd6js60slwbmiag7-ros-humble-rclcpp-16.0.6-r1/lib:/nix/store/bmf0p235z9mmvh8wdj2cwd1lpk7k20ml-ros-humble-qt-gui-cpp-2.2.2-r1/lib:/nix/store/gx3cbxsw10zg9lgy5kwbr9wa51m9fz5r-ros-humble-rcutils-5.1.3-r1/lib:/nix/store/hv09h2qxnd41bh3hk96za7i7dh4qwbzi-ros-humble-rcpputils-2.4.1-r1/lib:/nix/store/vn4ldak5gjhdjsph79ldz82xpl1xysyy-ros-humble-ament-index-cpp-1.4.0-r2/lib -- /nix/store/0dv0ylafnx7cdajyv9ahbpqrniblixq1-cmake-3.26.4/bin/cmake -E cmake_autogen /root/Downloads/rqt-release-release-humble-rqt_gui_cpp-1.1.5-2/build/CMakeFiles/rqt_gui_cpp_autogen.dir/AutogenInfo.json Release
    Executable: /nix/store/0ygi905vqpqq0qxnxphwaj922f0r4y7w-qemu-8.0.3/bin/qemu-x86_64
 Control Group: /user.slice/user-0.slice/session-16.scope
          Unit: session-16.scope
         Slice: user-0.slice
       Session: 16
     Owner UID: 0 (root)
       Boot ID: f59e8f7abfd14e77be8cb2689cffc2b0
    Machine ID: 798219257e90491ab95def1889396768
      Hostname: ci
       Storage: /var/lib/systemd/coredump/core.qemu-x86_64.0.f59e8f7abfd14e77be8cb2689cffc2b0.4109482.1696156191000000.zst (present)
  Size on Disk: 1.9M
       Message: Process 4109482 (qemu-x86_64) of user 0 dumped core.
                
                Module libpcre2-8.so.0 without build-id.
                Module libgcc_s.so.1 without build-id.
                Module libnuma.so.1 without build-id.
                Module libcapstone.so.4 without build-id.
                Module qemu-x86_64 without build-id.
                Stack trace of thread 4109482:
                #0  0x0000ffffad46a278 __sigsuspend (libc.so.6 + 0x3a278)
                #1  0x0000aaaac0c7314c dump_core_and_abort (qemu-x86_64 + 0x10314c)
                #2  0x0000aaaac0c73600 handle_pending_signal (qemu-x86_64 + 0x103600)
                #3  0x0000aaaac0c75048 process_pending_signals (qemu-x86_64 + 0x105048)
                #4  0x0000aaaac0bb3968 cpu_loop (qemu-x86_64 + 0x43968)
                #5  0x0000aaaac0baf83c main (qemu-x86_64 + 0x3f83c)
                #6  0x0000ffffad456e40 __libc_start_call_main (libc.so.6 + 0x26e40)
                #7  0x0000ffffad456f18 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x26f18)
                #8  0x0000aaaac0bafeb0 _start (qemu-x86_64 + 0x3feb0)
                
                Stack trace of thread 4109484:
                #0  0x0000ffffad512be4 syscall (libc.so.6 + 0xe2be4)
                #1  0x0000aaaac0cc2898 qemu_event_wait (qemu-x86_64 + 0x152898)
                #2  0x0000aaaac0ccad0c call_rcu_thread (qemu-x86_64 + 0x15ad0c)
                #3  0x0000aaaac0cc1528 qemu_thread_start (qemu-x86_64 + 0x151528)
                #4  0x0000ffffad4ae630 start_thread (libc.so.6 + 0x7e630)
                #5  0x0000ffffad516edc thread_start (libc.so.6 + 0xe6edc)
                
                Stack trace of thread 4109485:
                #0  0x0000aaaac0bb2c84 safe_syscall_base (qemu-x86_64 + 0x42c84)
                #1  0x0000aaaac0c8df50 do_syscall1.constprop.0 (qemu-x86_64 + 0x11df50)
                #2  0x0000aaaac0c8ef3c do_syscall (qemu-x86_64 + 0x11ef3c)
                #3  0x0000aaaac0bb3ac0 cpu_loop (qemu-x86_64 + 0x43ac0)
                #4  0x0000aaaac0c7e328 clone_func (qemu-x86_64 + 0x10e328)
                #5  0x0000ffffad4ae630 start_thread (libc.so.6 + 0x7e630)
                #6  0x0000ffffad516edc thread_start (libc.so.6 + 0xe6edc)
                ELF object binary architecture: AARCH64

@hacker1024
Contributor Author

The actual command that crashes QEMU is:

cmake -E cmake_autogen /path/to/build/dir/rqt-release-release-humble-rqt_gui_cpp-1.1.5-2/build/CMakeFiles/rqt_gui_cpp_autogen.dir/AutogenInfo.json Release

https://gitlab.com/qemu-project/qemu/-/issues/1502 may be related.
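
For anyone repeating this on another package, the guest command can be pulled out of the "Command Line" field of the coredumpctl output. The sketch below assumes that the guest command is whatever follows the first " -- " separator in that field, which matches the command quoted above. The function name guest_command is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the part of a QEMU command line that follows
# the first " -- " separator. Returns 1 if there is no separator.
# $1: the full "Command Line" value as a single string.
guest_command() {
    line="$1"
    case "$line" in
        *" -- "*)
            # Strip the shortest prefix ending in " -- ".
            printf '%s\n' "${line#* -- }"
            ;;
        *)
            return 1
            ;;
    esac
}
```

Pass the whole "Command Line" value as one quoted argument. The printed command can then be rerun on its own in the build directory instead of restarting the full build.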
