.lock files incorrectly block the build process #10897

Closed
chayleaf opened this issue Jun 12, 2024 · 3 comments
Labels: bug, protocol (Things involving the daemon protocol & compatibility issues)

Comments

chayleaf commented Jun 12, 2024

Describe the bug

I only have this issue on one of my Arm machines that I use as a remote builder.

Whenever I'm building something, Nix will occasionally print "copying 0 paths...". When that happens, the build stalls until I run the following commands on the remote builder:

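# Collect the store lock files that were modified within the last 24 hours
to_remove="$(find /persist/nix/store/*.lock -mtime 0)"
echo "Removing $to_remove..."
# Left unquoted on purpose so that each path becomes its own argument
rm $to_remove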

(/persist is the path that backs impermanence; I have to use that path here because /nix is read-only)
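
For anyone who wants to inspect the lock files before deleting them, the following is a minimal sketch of a listing-only variant of the workaround above. The function name `list_recent_locks` is hypothetical, and the sketch assumes that the lock files sit directly inside the store directory and that `find` supports `-maxdepth` (GNU and BSD `find` do). It deletes nothing.

```shell
# Sketch: print the *.lock files directly inside a store directory that were
# modified within the last 24 hours. Nothing is deleted.
# list_recent_locks is a hypothetical helper name, not part of Nix.
list_recent_locks() {
    # $1: store directory to inspect (assumption: locks are not in subdirectories)
    find "$1" -maxdepth 1 -type f -name '*.lock' -mtime 0 -print
}
```

On the builder from this report it would be called as `list_recent_locks /persist/nix/store`; the printed paths are the ones the workaround would remove.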

This happens on Nix 2.18 and has happened on earlier Nix versions as well, since at least November (possibly long before that).

Steps To Reproduce

It's unclear what exactly causes the issue, but in my specific conditions it happens 100% of the time when "copying 0 paths..." is printed.

Expected behavior

The build continues as usual.

nix-env --version output

nix-env (Nix) 2.18.2

Additional context

Perhaps the fact that I'm using bcachefs with no ACLs could cause the issue here? I'm not sure. I'm willing to work on fixing this (in fact I suppose it's a necessity, given that nobody else has complained about it), but I have no idea where to begin.


chayleaf added the bug label Jun 12, 2024

roberth (Member) commented Jun 17, 2024

Do you have remote builders for your remote builders? If not, we can rule out #10740.
Could you try with -vvvvv?
If that does not reveal a potential cause, could you attach GDB and print stack traces for the client and the remote nix-daemon's worker process? You can find the latter's pid in the process tree under an sshd process. Directly under ssh you might only find a dumb proxy, in which case we'll probably need stack traces from the corresponding nix-daemon worker process. Those are started with the client's pid as an argument, precisely so that they can be correlated when debugging.
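
As a rough illustration of the stack-trace request above, the sketch below wraps a non-interactive GDB invocation that attaches to a pid and prints a backtrace for every thread. The helper name `dump_stacks` and the `RUN` switch are hypothetical; the GDB options used (`-p`, `-batch`, `-ex`) and the `thread apply all bt` command are standard GDB. Attaching to another user's process usually requires root, and the pid itself still has to be looked up in the process tree first (for example with `ps -ef`).

```shell
# Sketch: print stack traces for all threads of a running process with GDB.
# dump_stacks and RUN are hypothetical names used only for this example.
#   dump_stacks PID        prints the command that would be run (dry run)
#   RUN=1 dump_stacks PID  attaches GDB and prints the backtraces
dump_stacks() {
    if [ "${RUN:-0}" = 1 ]; then
        # -batch exits after the commands given with -ex have finished
        gdb -p "$1" -batch -ex 'thread apply all bt'
    else
        printf "gdb -p %s -batch -ex 'thread apply all bt'\n" "$1"
    fi
}
```

It would be run once against the client process and once against the matching nix-daemon worker on the remote builder, so that the two traces can be compared.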

chayleaf (Author) commented Jun 17, 2024

I do have remote builders for my remote builders. The scheme is as follows: x86_64 workstation <-> aarch64 server, the server uses my workstation for x86_64 build jobs, the workstation uses the server for aarch64 build jobs. Should this issue be closed in favor of #10740?

roberth added the protocol label (Things involving the daemon protocol & compatibility issues) Jul 5, 2024

roberth (Member) commented Jul 5, 2024

> Should this issue be closed in favor of #10740?

Yeah, that seems to be the same underlying issue then.
Thanks for confirming!

You could subscribe to that issue if you haven't already.

roberth closed this as not planned Jul 5, 2024