Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically updated nixos channel pins #252057

Closed
wants to merge 1 commit into from

Conversation

infinisil
Copy link
Member

@infinisil infinisil commented Aug 29, 2023

Description of changes

Warning
A bit drafty, please only review the overall idea for now

This makes two changes:

  • lib.channel.latestKnownNixOSChannelInfo: An attribute set containing information on the latest known successful NixOS channel (either nixos-unstable on master or nixos-XX.YY on release-XX.YY):

    {
      rev = "a999c1cc0c9eb2095729d5aa03e0d8f7ed256780";
      sha256 = "178smvv8f8pashdjcr9bhmp0baji0lhfcxqy3cn7m19g8rgd6539";
    }
  • A GitHub Actions workflow to keep this updated whenever the channel gets updated:

    • Pushes to the nixos-unstable branch (only done when the channel updates) will cause it to create a PR to update the pin on the master branch.
    • Simliarly, pushes to the nixos-XX.YY branches will cause it to create PR's to the release-XX.YY branches.

    These PR's will have to be reviewed and merged manually.

Here's a successful run of the workflow in a fork which I simulated by pushing to the nixos-unstable branch, and it created this PR.

Motivation

For RFC 140 I would like to have the checking tool introduced in #250885 be run for every PR, to ensure the structure of pkgs/by-name stays correct. In order to make that possible, the tool will be always be pre-built on the nixos-* channels, allowing CI to run it without building anything. See here for more context.

While CI could just fetch from the unstable channel directly, this would make CI impure, giving potentially different results at different times for the same revision since the channel could be updated at any time.

By pinning the channel inside Nixpkgs itself, this isn't a problem anymore, because CI can use the Nixpkgs from the pinned version.

Things done

  • Verified that the workflow runs successfully in a fork, including for release branches.
  • Documentation

@github-actions github-actions bot added 6.topic: policy discussion 6.topic: lib The Nixpkgs function library labels Aug 29, 2023
@infinisil infinisil requested a review from a user August 29, 2023 00:29
@infinisil infinisil changed the title Pin channel Automatically updated nixos channel pins Aug 29, 2023
@infinisil infinisil added significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc. backport release-23.05 labels Aug 29, 2023
@infinisil infinisil requested review from zowoq, a team and Mic92 August 29, 2023 00:32
@Artturin
Copy link
Member

Artturin commented Aug 29, 2023

👎 on these kinds of automated commits which duplicate work (in this case channels) for not much benefit and clutter the history.

Will the program really be updated so often after the initial changes that this change is warranted?

While CI could just fetch from the unstable channel directly, this would make CI impure, giving potentially different results at different times for the same revision since the channel could be updated at any time.

To me using the channels sounds OK.
IIRC ofborg merges the PR to the target branch so this wouldn't be any worse (instead it would be better).

@infinisil
Copy link
Member Author

infinisil commented Aug 29, 2023

Yeah honestly I'd be okay with using the unstable channel for this too, it shouldn't be a big problem. @amjoseph-nixpkgs originally was worried about this, which prompted me to look into implementing this, and I think it's kind of neat and does have some benefit.

It does go into the direction of a much more beneficial change: Tracking the entire channel history in a git repository, replacing the (now-broken) https://channels.nix.gsc.io/. This would be really neat for pinning in general, you could e.g. say "give me nixos-unstable from 2023-08-23", all in Nix. But probably that should be in a separate repository anyways, not in Nixpkgs itself.

@tomberek
Copy link
Contributor

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/is-channels-nix-gsc-io-active-or-is-there-an-alternative/32303/7

@ghost
Copy link

ghost commented Oct 22, 2023

While CI could just fetch from the unstable channel directly, this would make CI impure, giving potentially different results at different times for the same revision since the channel could be updated at any time.

To me using the channels sounds OK.

The problem with that is that CI is no longer deterministic!

So you and I could run the same CI, at the same git checkout, and get different results. Tracking down why is a nightmare! Or you and OfBorg. It's really crucial that developers can reproduce the exact build that OfBorg performs, and always get the exact same result. They should be able to do this using only nix, the nixpkgs repo, and FODs whose hashes are in the repo.

I'm totally sympathetic to the "commit noise" problem. For the record, my original suggestion was to update the pointer only when staging is merged into master, as part of that merge since those are the mass-rebuild landings. They're also semi-manual and infrequent (rarely more than twice a month). You could even include the channel-bump change as part of the merge commit for zero commit noise.

Somewhere along the way my suggestion appears to have morphed into a grander scheme to have nixpkgs' git repo track channels.nixos.org, which is a nifty feature but not without downsides.

Anyways.

@infinisil
Copy link
Member Author

The problem with that is that CI is no longer deterministic!

Actually, recently I've noticed a very good reason for why CI shouldn't be deterministic in Nix: To merge things into master, CI should use the latest checks for master. If you use older checks, you risk breaking master once it's merged!

And indeed, this is also what GitHub's pull_request workflow event also does by default: It checks out the merge commit between the latest base branch commit and the PR head commit.

But still, you can still reasonably reproduce the CI result by looking at the CI logs, which will show you exactly what happened. It just won't be reproducible entirely in Nix.

@ghost
Copy link

ghost commented Oct 24, 2023

This was originally all hashed out on IRC about two months ago, and I failed to summarize it in a more stable medium. Below is my attempt to do so, since we're coming back to some of the same questions which were already addressed.

To merge things into master, CI should use the latest checks for master.

To clarify: "the latest checks" here refers to the binaries (mainly the rnix parser, which needs rustc) needed by the tests. Any tests written in nix code (like lib/tests) will of course continue to be taken from the commit to be merged.

The question is, where do we get these binaries from? Here are the options:

  1. Build them at HEAD (i.e. at the commit being merged)
  2. Build them at some commit pinned in nixpkgs which is updated automatically...
    a. with each merge of staging into master
    b. every time channels.nixos.org advances (i.e. hydra completes a build)
  3. Build them at whatever commit happens to be reported by some nondeterministic website, such as channels.nixos.org, when CI queries it

To merge things into master, CI should use the latest checks for master.

... by which I think you mean option (1) above:

To merge things into master, CI should use rnix built by rustc from master.

This is the ideal strategy if you have unlimited compute power.

When we discussed it a month or two ago, the main objection was that it would mean that OfBorg would need to build rustc in order to run CI, and people said that would make CI slow. I pointed out that for PRs against staging, OfBorg already re-bootstraps stdenv and cppnix for every PR, so we're sort of already paying that cost. So I think (1) is okay, but others weren't sure.

I proposed (2a) as a compromise. We're already well into the territory where anything that affects rustc has to go to staging anyways.

This was said to be too laggy (staging merges every ~two weeks-ish), so (2b) was proposed to give us fresher tooling. I'm fine with that too.

But still, you can still reasonably reproduce the CI result by looking at the CI logs, which will show you exactly what happened. It just won't be reproducible entirely in Nix.

This sounds more like (3) above, which I really have a problem with. Please don't do this. Determinism is precious. It's always tempting to say "oh, we can allow just this little bit of nondeterminism, it's too hard or too much work to be totally deterministic" but this stuff accumulates like a form of debt.

Please. It's really important that all the data needed to run CI are contained (perhaps by SRI-hash reference) within the nixpkgs repo and the commit being merged to it. Any additional data source you have to pull in to run CI makes running it more frustrating and less reliable.

@ghost
Copy link

ghost commented Oct 24, 2023

And indeed, this is also what GitHub's pull_request workflow event also does by default: It checks out the merge commit between the latest base branch commit and the PR head commit.

Woah, so if I click the "Rebase and Merge" button, which does not create a merge commit, CI for the next PR merged after that will ignore it? That sounds crazy.

@infinisil
Copy link
Member Author

I don't fully understand what you're saying, but assuming you want CI to be deterministic based on the HEAD commit (that is, for the same HEAD commit you always get the same CI result), then there's a simple problematic example:

  • 1 year ago I made a commit based on the then-latest master commit
  • Today I'm creating a PR with that commit
  • In order to be deterministic based on that commit, CI will have to use dependencies and checks from 1 year ago. Nothing about the commit changed in the year, so nothing about the CI result should've been able to change

And this is obviously problematic because the checks might be completely different now than they were 1 year ago. And since you can just choose the parent commit anyways, you don't even need to wait for a year. I could create a commit based on any of the >500'000 commits over the last 20 years, and it would have to use CI checks from that point.

So it makes no sense for it to be deterministic based on the HEAD commit, instead you also need some way for the base branch state to influence the CI result. You can use the latest base branch commit (that's what GitHub's pull_request does), or you can use a delayed but tested version of the latest base branch commit (that's what the channels do).

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

I don't know github's API well (or at all really) so hopefully we can put this thing into some kind of one-shot mode to try it out.

It looks like this just opens PRs, but humans have to merge them (for now). That sounds like a good idea to start with.

@@ -0,0 +1,4 @@
{ lib }:
{
latestKnownNixOSChannelInfo = lib.importJSON ./pin.json;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
latestKnownNixOSChannelInfo = lib.importJSON ./pin.json;
# This pin is updated to point to the most recent completely-successful
# build of the Hydra channel which has the maximal set of pnames (i.e.
# advances most conservatively). Currently that Hydra channel is named
# "nixos-unstable".
latestKnownHydraChannelInfo = lib.importJSON ./pin.json;

I still remember how tremendously confusing it was to me that the nixpkgs-unstable channel actually tracks darwin, whereas the nixos channel is the one you want to follow if you use linux even if you don't use nixos.

Obviously changing the channel names is not on the table at this point, but maybe we can try to limit the propagation of this terribly-confusing naming scheme?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the nixpkgs-unstable channel actually tracks darwin

Huh I don't think so? I don't see anything Darwin-specific for that channel anywhere in the Hydra setup.

Comment on lines +7 to +8
# Any release branches like nixos-23.05
- 'nixos-[0-9][0-9].[0-9][0-9]'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Any release branches like nixos-23.05
- 'nixos-[0-9][0-9].[0-9][0-9]'
# Any release branches like nixos-23.05
- 'nixos-[0-9][0-9].[0-9][0-9]'
# There is code below that assumes any branch not named `nixos-unstable`
# matches the pattern above; be sure to update that code if you add to this list.

runs-on: ubuntu-latest
steps:
- uses: cachix/install-nix-action@v22
- name: Compute development branch
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- name: Compute development branch
- name: Compute name of base branch for newly-created PR

branch=release${GITHUB_REF_NAME#nixos}
fi
echo "branch=$branch" >> "$GITHUB_OUTPUT"
- name: Check out development branch
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- name: Check out development branch
- name: Check out base branch for newly-created PR

uses: actions/checkout@v3
with:
ref: ${{ steps.dev-branch.outputs.branch }}
- name: Update pin
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- name: Update pin
- name: Create commit and prepare PR which will update the pin

echo "pr_title=$pr_title" >> "$GITHUB_OUTPUT"
echo "pr_body_path=$pr_body_path" >> "$GITHUB_OUTPUT"
fi
- name: Create Pull Request
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- name: Create Pull Request
- name: Open the Pull Request prepared in the previous step

newRev=$GITHUB_SHA
pinFile=lib/channel/pin.json

echo "Fetching new revision $newRev"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo "Fetching new revision $newRev"
echo "Fetching tarball for new git revision $newRev of nixpkgs"

change_url="$GITHUB_SERVER_URL"/"$GITHUB_REPOSITORY"/compare/"$oldRev".."$newRev"

echo "Checking if anything other than $pinFile changed between $oldRev and $newRev"
# Only don't make a PR if only the pin file changed, not if it was added/removed
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Only don't make a PR if only the pin file changed, not if it was added/removed
# Don't make a PR if the pinfile already existed and has changed but nothing else has changed.

echo "Nothing changed, no PR to update the pin necessary"
create_pr=
else
echo "The channel changed, PR to update the pin is necessary"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo "The channel changed, PR to update the pin is necessary"
echo "The channel changed, and there have been non-channel-pin changes, so a PR to update the pin is necessary"

@ghost
Copy link

ghost commented Nov 7, 2023

I don't fully understand what you're saying, but assuming you want CI to be deterministic based on the HEAD commit (that is, for the same HEAD commit you always get the same CI result),

Yep, you understood exactly what I was saying.

* 1 year ago I made a commit based on the then-latest master commit
* Today I'm creating a PR with that commit
* In order to be deterministic based on that commit, CI will have to use dependencies and checks from 1 year ago.

... or simply refuse to start CI at all if the PR's base commit is too old.

There are already sanity checks made before even attempting to run CI (like how the by-name checks don't even try to run if you have a merge conflict). Deciding to refuse to even attempt to begin running CI doesn't need to be deterministic, of course. OfBorg could be offline, after all!

So, to clarify, you can decide that it's not worth attempting CI for any number of nondeterministic reasons (merge conflict, OfBorg is offline, unpaid bills, intergalactic warfare, whatever) but once you attempt CI that attempt should be deterministically reproducible.

A simple "is the PR's base commit too old" check easily accomplishes this.

Side note: this problem with base commits is an artifact of github's wonky pull-request worldview. In a merge-queue setup like most Gerrit installs use this problem can't occur (and you can't cheat your way around CI) because each commit is rebased and CI'd separately before merging.

@infinisil
Copy link
Member Author

infinisil commented Nov 7, 2023

A simple "is the PR's base commit too old" check easily accomplishes this.

I'm not convinced. The amount of time you set as "too old" also creates a trade-off between:

  • How often people need to rebase active PRs
  • How quickly new checks get enforced

If you set this to 1 week, it means that I'd have to rebase my WIP PR's every week to even get CI to run (even if nothing changed about the checks!). And it also means that for an entire week, people can still make new PR's that don't use the new checks. Both of these suck.

In comparison, if we use the latest base branch, we don't need to make that trade-off, we can have the best of both worlds.

@ghost
Copy link

ghost commented Nov 11, 2023

In comparison, if we use the latest base branch, we don't need to make that trade-off, we can have the best of both worlds.

And nondeterministic, unreproducible tests.

If you set this to 1 week, it means that I'd have to rebase my WIP PR's every week to even get CI to run (even if nothing changed about the checks!).

Yes, you should always rebase your PR if you want to run CI again. This is how it ought to work.

people can still make new PR's that don't use the new checks.

Nope, they'll always use the latest test vectors, since those are in-tree. The only thing that's a week-old is the binaries for the tests. Since nobody's stupid enough to compile the test vectors into the binaries, this isn't a problem.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-11-28-nixpkgs-architecture-team-meeting-46/36171/1

@infinisil
Copy link
Member Author

With #281374, the pkgs/by-name check doesn't use fetch the tool impurely from the channel anymore, and instead uses a pinned version that needs to be updated separately. I'll close this PR for now as that was the main motivation and should address the concerns here.

@infinisil infinisil closed this Jan 17, 2024
@infinisil infinisil deleted the channel-pin.yml branch January 17, 2024 20:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: lib The Nixpkgs function library 6.topic: policy discussion 10.rebuild-darwin: 0 10.rebuild-linux: 1-10 significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants