Skip to content
This repository has been archived by the owner on Aug 17, 2022. It is now read-only.

relocation truncated to fit: R_RISCV_RVC_LUI #144

Closed
ldoolitt opened this issue Apr 5, 2018 · 5 comments
Closed

relocation truncated to fit: R_RISCV_RVC_LUI #144

ldoolitt opened this issue Apr 5, 2018 · 5 comments

Comments

@ldoolitt
Copy link

ldoolitt commented Apr 5, 2018

This error was observed in real life, a typical final error signature looks like
timer.c:(.text+0xa): relocation truncated to fit: R_RISCV_RVC_LUI against symbol 'timerStartValue' defined in .sbss section in ../../firmware/src/timer.o /opt/riscv32imc/lib/gcc/riscv32-unknown-elf/7.2.0/../../../../riscv32-unknown-elf/bin/ld: final link failed: Symbol needs debug section which does not exist collect2: error: ld returned 1 exit status
Previously observed in a gcc issue, where lz-sc provided a patch.
I confirmed the problem still exists as of commit 1b80cbe of riscv-gnu-toolchain-rv32i,
and that a re-based version of lz-sc's patch still fixes it. My gihub-fu failed me, so it's not attached here; you can get it from my server.
We've started reducing the test case, but it's still kind of a mess and I'm not comfortable posting it here yet, sorry.

@jim-wilson
Copy link
Collaborator

lz-sc's patch is flawed. Because we are using lui/addi to load 32-bit constants, and addi is signed, the value used for lui has to be rounded to the nearest multiple of 0x1000. So for instance to load 0x1ffff, the lui loads 0x2000 and the addi subtracts 1. This is why the current code uses RISCV_CONST_HIGH_PART to adjust the constant value before checking if it is valid. lz-sc's patch does not do this correction, and hence will reject some valid values for c.lui.

However, there does appear to be an unfixed problem here, as the bug reported in riscvarchive/riscv-gcc#97 is for a c.lui with a 0 immediate, but the fix linked to it is for handling an x0 destination. So it appears the original bug may not have been fixed. But since lz-sc's patch is flawed, that does not look like the correct fix.

Unfortunately, it isn't clear what the actual bug is here. I'd need a testcase that I can use to reproduce it. The gcc issue doesn't have a complete testcase, and you haven't provided one yet either. There could be something funny happening during relaxation that the current code does not account for.

You might try using the FSF binutils sources. The riscv-binutils-gdb sources are out-of-date, and a number of problems have already been fixed in the upstream FSF binutils tree. Also, the FSF binutils tree will make this problem a little easier to debug, as you can add -Wl,--noinhibit-exec to the gcc link command, and then use objdump on the resulting executable to see what the symbol values are and what the final instruction is.

@ldoolitt
Copy link
Author

ldoolitt commented Apr 5, 2018

I got a test case reduced to the point where it doesn't have too much distracting cruft. Attached.

linker_error_test4.tar.gz

@jim-wilson
Copy link
Collaborator

I see the problem. Your linker script has a single memory section, so text+data ends up in a single segment in the executable. Normally, text and data are separate segments, and when we do relaxation on text, that does not affect addresses in the data section. But with text and data in the same segment, relaxation causes addresses in the data section to decrease and that breaks assumptions in the riscv port. In this particular case, the original value for the symbol is 0x800 which gives a valid c.lui of 0x1. Then relaxation decreases text size, and the final value for the symbol is 0x694 which gives an invalid c.lui of 0.0.

Unfortunately, the code size decrease from relaxation is potentially unbounded, so there is no easy way to calculate any useful value here. The only good solutions are to require that text and data are in separate segments, or to disable relaxation.

The code already disables relaxation for text symbols so that this problem can not occur for them. At the top of _bfd_riscv_relax_lui is
/* Mergeable symbols and code might later move out of range. */
if (sym_sec->flags & (SEC_MERGE | SEC_CODE))
return TRUE;
If we wanted to do the same for data symbols, then this would effectively be
if (1)
return TRUE;
and we wouldn't have any lui relaxation anymore which would be undesirable.

@jim-wilson
Copy link
Collaborator

FYI you can turn off linker relaxation from gcc with -Wl,--no-relax.

@ldoolitt
Copy link
Author

ldoolitt commented Apr 6, 2018

OK, that makes sense. Thanks for looking into it. We clearly need to think through our section ordering more carefully.

Here's an archival copy of the (useless?) patch I posted above, in case my server ever goes dark:
elfnn-riscv-7615ca1.txt

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants