Inconsistency detected by ld.so: dl-lookup.c: 966: _dl_setup_hash: Assertion `(bitmask_nwords & (bitmask_nwords - 1)) == 0' failed! #368

Closed
ghost opened this issue Mar 3, 2022 · 5 comments · Fixed by #380

ghost commented Mar 3, 2022

Describe the bug

During the nixpkgs bootstrap process, binaries which link against a patchelf'ed librt.so.1 abort with:

Inconsistency detected by ld.so: dl-lookup.c: 966: _dl_setup_hash: Assertion `(bitmask_nwords & (bitmask_nwords - 1)) == 0' failed!
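
For readers unfamiliar with the expression in the assertion: `(x & (x - 1)) == 0` is the standard bit trick for testing that `x` is a power of two. In glibc's `_dl_setup_hash`, `bitmask_nwords` is read from the header of the library's GNU-style hash section, so the failure suggests that ld.so is reading a value from the patched file that is not what the linker originally wrote. The sketch below only illustrates the arithmetic; the helper name `passes_nwords_check` is made up for this example.

```shell
# Hypothetical helper: succeeds when n passes the same bit test as the
# assertion. Note that 0 also passes, exactly as it does in the C expression.
passes_nwords_check() {
    n=$1
    [ $(( n & (n - 1) )) -eq 0 ]
}

# Try a few word counts; only powers of two survive the check.
for n in 1 2 4 6 8 12; do
    if passes_nwords_check "$n"; then
        echo "$n: ok"
    else
        echo "$n: would trip the assertion"
    fi
done
```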

Steps To Reproduce

  1. Set your NIX_PATH explicitly, if you have not yet done so (i.e. such that $NIX_PATH/nixpkgs/ points to a git checkout of nixpkgs).

  2. Execute these commands to build the mips64el bootstrap tools, mips64el-nix, and launch a mips64el qemu VM:

# Assumes you have set $NIX_PATH explicitly.  If not, do so.

cd $NIX_PATH/nixpkgs

git fetch https://github.com/a-m-joseph/nixpkgs patchelf-issue-368
git switch -c patchelf-issue-368 FETCH_HEAD

cd /tmp  # or elsewhere
git clone -b patchelf-issue https://github.com/a-m-joseph/mips64-nixpkgs-qemu
cd mips64-nixpkgs-qemu
nix-shell shell.nix --argstr NIXPATH $NIX_PATH

Once the VM boots to a root shell, paste this command:

./demo.sh

You should get:

Inconsistency detected by ld.so: dl-lookup.c: 966: _dl_setup_hash: Assertion `(bitmask_nwords & (bitmask_nwords - 1)) == 0' failed!
error: builder for '/nix/store/76h8qjiv0kr99n87g01qz3sq4z5svikv-bootstrap-tools.drv' failed with exit code 127

This is caused by running patchelf --set-rpath on librt.so; if you leave librt.so alone and patchelf only the other libraries (as I do here), you won't get this error. Note that librt.so has no RPATH prior to patchelf'ing. I suspect the problem is a corner case involving --set-rpath on a library that did not have an RPATH to begin with.
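
One way to confirm whether a library has an RPATH or RUNPATH before patching is to look at its dynamic section with `readelf -d` (or to run `patchelf --print-rpath` on it). The following is a minimal sketch: the helper name `has_rpath` is made up here, and the library path in the usage comment is a placeholder.

```shell
# Hypothetical helper: reads `readelf -d` output on stdin and succeeds
# if the dynamic section carries an RPATH or RUNPATH entry.
has_rpath() {
    grep -Eq '\((RPATH|RUNPATH)\)'
}

# Usage sketch (the path is a placeholder, adjust to your tree):
#   readelf -d lib/librt.so.1 | has_rpath && echo "has rpath" || echo "no rpath"
```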

Expected behavior

The bootstrap completes.

patchelf --version output

# LD_LIBRARY_PATH=lib lib/ld.so.1 ./patchelf  --version
patchelf 0.14.5

Additional context

The last commit in the nixpkgs repo used above is just a bit of paranoia; you can drop that commit and you'll still get the same result.

I discussed this bug here, but I'm no longer sure it is the same problem that the person who opened that bug was having. To avoid hijacking their bug, I am opening a separate one now that I have a simple way to reproduce the issue.

ghost commented Mar 3, 2022

Note: if you get a build failure in libredirect, set doInstallCheck=false; in pkgs/build-support/libredirect/libredirect.nix. You might also need to set doCheck=false in openssh/common.nix. Other than these two changes it should build cleanly.

I think these two problems are a result of me just picking a bad day to rebase against master.

ghost commented Apr 22, 2022

This problem also occurs if you try to patchelf --set-interpreter either libc.so or libpthread.so. Fortunately there really is no need for either of these libraries to have an ELF interpreter (unlike ld.so they are not dual-function library-binaries).

ghost commented Jun 19, 2022

I've been attempting to make further progress on this.

Strangely, mips bootstrap-files built at the current nixpkgs HEAD no longer experience this bug. Something changed in glibc between version 2.33 and 2.34.

Edit: the guess below was incorrect. My best guess is the major overhaul of how they handle thread-local storage.

A diff -u <(readelf -a ...) <(readelf -a ...) on the before/after librt.so shows that librt.so no longer has the STATIC_TLS flag in its dynamic section:

- 0x000000000000001e (FLAGS)              BIND_NOW STATIC_TLS
+ 0x000000000000001e (FLAGS)              BIND_NOW

And errno@GLIBC_PRIVATE no longer has a weird MIPS-specific relocation type in .rel.dyn:

-Relocation section '.rel.dyn' at offset 0x2488 contains 5 entries:
+Relocation section '.rel.dyn' at offset 0xae0 contains 2 entries:
...
-0000000183a8  000800000030 R_MIPS_TLS_TPREL6 0000000000000000 errno@GLIBC_PRIVATE
-                    Type2: R_MIPS_NONE
-                    Type3: R_MIPS_NONE

I'll keep working on this.

ghost commented Jun 19, 2022

Edit: the guess below was incorrect. More evidence supporting my suspicion, in glibc-2.34:
  • librt.so.1 no longer has the STATIC_TLS flag or any R_MIPS_TLS_TPREL6 relocations
  • libc.so.6 now has the STATIC_TLS flag and some R_MIPS_TLS_TPREL6 relocations
  • if I adjust nixpkgs' unpack-bootstrap-files.sh script so it patchelfs libc.so.6 (in addition to librt.so.1), the crashing comes back

I strongly suspect that patchelf is mishandling some aspect of STATIC_TLS, most likely the R_MIPS_TLS_TPREL6 relocation.
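
The per-library comparison in the bullets above can be repeated with a small filter over `readelf -d` output. This is only a sketch: the helper name `has_static_tls` is made up, and the library paths in the usage comment are placeholders.

```shell
# Hypothetical helper: reads `readelf -d` output on stdin and succeeds
# if the FLAGS entry of the dynamic section includes STATIC_TLS.
has_static_tls() {
    grep -E '\(FLAGS\)' | grep -q 'STATIC_TLS'
}

# Usage sketch (paths are placeholders):
#   for lib in lib/librt.so.1 lib/libc.so.6; do
#       readelf -d "$lib" | has_static_tls && echo "$lib: STATIC_TLS"
#   done
```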

ghost commented Jun 19, 2022

Woo hoo!
