-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: out of bounds panic when diffing large number of words #695
base: master
Are you sure you want to change the base?
Conversation
f4161f2
to
b686f58
Compare
I ran a small script to compare the output of db281c6 with my branch to find the cause of regression, it's all the trailing newlines that were not being printed before. |
Thanks for the pull request, and the clear explanation! This sounds like a regression from 8c004be. |
You're welcome.
I'm not sure if this is the cause of the regression because unless I'm missing something, the changes in this commit are just inlining the content of @Wilfred Do you have any plans to merge my changes in the near future? If so, I can update the regression file so the pipeline succeeds. If you don't want to merge, I can use the examples provided in the reported issues to bisect which commit caused the regression and bring it to the discussion. |
What's the status of this PR? I'm running into this issue too. |
Idk, everything is fine from my side. It depends on how @Wilfred wants to go from there. I've been personally just building difftastic with my patches on top since without them difft would crash 90% of the time for my use-case. |
This approach removes the additional empty line that would cause the display logic to disagree in line count. The major difference is that this new approach does not change the output of the display implementation, therefore, no test files need to be updated, and no additional line is printed to the end-user. As for the cause of the index out-of-bounds, this is simply a disagreement between line sources, such as the `line-numbers` crate and Tree Sitter, which all reports the correct number of lines, with the `str::lines` iterator which does not iterate over the last new line character if the last line is empty.
017381e
to
d6c22b8
Compare
This is caused by the same problem as before, but the disagreement seems to be with the Tree itself and happens inside `syntax::change_positions`, there may be another way to work around this by removing the correct entry from `rhs_positions` when the conditions are met.
d6c22b8
to
ab39730
Compare
I noticed that #731 does the same thing as I did (but slightly differently), which is to ignore the last empty line, and it will fix most of the out-of-bounds panics except for the yaml one, which is caused by the same problem, but in a different location (mine "fixes" this one by skipping out of bounds indexes). |
I dropped all the old text because it does not make sense anymore.
The problem is basically that Rust's
str::lines()
will return the same number of lines regardless of the last character being a new line or not, and the assumptions made on inline and side-by-side display implementations depends on having a clear distinction of those cases.Those changes introduces a function that will chain an additional element when the string ends with a newline. I'm not aware of any function from stdlib that does what we need (and I couldn't find one by looking at the documentation), the closest one wassplit
, but it has obvious implications (\r\n
, for example).This does solves #688, #694, #682 and #681, I tested all the provided files and they're all working without any panics.
Update 2024-07-08:
I reworked on this patch and now it does not add a new line to the end of the string, therefore, all tests are passing, including the regression ones (which was failing before). In this new revision it fixes #682 again, which involves
--display=inline
and was not working anymore when I initially merged the changes from master into my branch.I also tested another case that I found out and it fixes #656.
The new approach is more hacky: in one case it just removes the additional line range when the line is empty, and in another case it just checks if the index is within range before trying to access the index. However, unlike the previous solution, this one does not fail the regression tests.