[RFC 0156] No Direct Nixpkgs Pushes #156

Closed
wants to merge 4 commits into from

infinisil
Member

Require pull requests for all Nixpkgs commits. A slimmer and updated version of #79.

Rendered

[^1]: Unix epoch 1658361600 to 1689897600

To determine whether a commit was pushed directly, the GitHub API was queried for pull requests associated with that commit (see [`associatedPullRequests`](https://docs.github.com/en/graphql/reference/objects#commit)).
If this list includes a merged pull request to the Nixpkgs master branch, the commit is known to be merged with a pull request.
Member

I would add a remark here that GitHub does allow merging PRs via a direct push of a clean-merge commit. (I cannot figure out from the documentation whether it allows pushing an unclean merge, which would be an issue if we took the «malicious code» part seriously.)

Member Author

Will test this

Member

I think if there are any merge conflicts, then the merge commit may contain arbitrary changes in order to fix that commit. Not sure if this is prevented on PRs with merge conflicts or if this could be prevented.

- It had a mistaken estimate for the percentage of direct master commits, calculating it to be 46.85% in the last year.
The mistake was assuming that all non-merge commits were direct pushes.
This made it seem like the change was much more impactful than it actually would've been.
- It was too ambitious by also proposing to require accepting reviews for all pull requests.
Member

7c6f434c commented Jul 22, 2023

I think another important point is that in the years since #79 the discussions led to various workflows to migrate gradually away from direct pushes. The old RFC proposed a step change unaligned with then-current practice (also it wanted reviews but whatever), the new RFC proposes to finalise a process transition that has had time to be almost completely implemented in practice.

Member Author

It takes a bit of time, but I can also check how many commits were directly pushed to master back then. I wouldn't be surprised if it was already very low.

Member

7c6f434c commented Jul 22, 2023

It was pretty low for sure, but noticeably higher — and the updates to staging workflows in the meantime probably matter.

Ah, maybe I can put it like that: back then, there were some true positives in the direct push tracking action — comparable amounts to false positives, nowadays this doesn't seem to happen at all.

Contributor

asymmetric commented Jul 22, 2023

@7c6f434c are you saying that customs changed between #79 and now? (If so I would be curious why). Or was there something else that makes the change more palatable now than it was back then? (other than the review requirement)

Member

Back then, the majority of changes already went through a PR at some point on their way to the main branch. For the exceptions, there were a few years of advertising specific changes to the specific workflows while talking up the virtues of the ofBorg eval check until everyone either agreed or was nagged into compliance; apparently, after all the workflow work, the strong arguments for direct pushes to the channel-feeding branches are resolved.

This is the difference between «now» and «the current RFC transplanted back in time with the data re-processed according to the moment of submission». «RFC 79, try again» would hopefully fail again, because of the other demands it also had.

Member Author

infinisil commented Jul 23, 2023

I got the results: out of a total of 50457 master commits in the last year at RFC 79 start time (footnote 1), at most 2682 of them (5.315%) were direct pushes to master, complete listing here (not counting the 153 obvious false positives from @web-flow, see #156 (comment)).

So while it was still about 10 times less than the original estimate in the RFC (46.85%), it was 100 times more than now (0.0517%)! I will update the RFC with this information.

Footnotes

  1. Unix epoch 1569369600 to 1600992000
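
As a quick check of the figures in this comment, the percentages and ratios can be recomputed from the quoted counts. This is only an illustrative sketch: the names are made up here, and the 46.85% and 0.0517% values are copied from the thread rather than recomputed.

```python
# Counts and percentages quoted in the comment above.
TOTAL_COMMITS = 50457        # master commits in the year before RFC 79
MAX_DIRECT_PUSHES = 2682     # upper bound on direct pushes in that window
ORIGINAL_ESTIMATE = 46.85    # percent, the mistaken estimate from RFC 79
CURRENT_SHARE = 0.0517       # percent, the figure quoted for the last year

WINDOW_START, WINDOW_END = 1569369600, 1600992000  # Unix epochs from footnote 1


def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, expressed as a percentage."""
    return 100 * part / total


share_then = percent(MAX_DIRECT_PUSHES, TOTAL_COMMITS)
window_days = (WINDOW_END - WINDOW_START) / 86400

print(f"direct pushes back then: at most {share_then:.3f}%")                 # 5.315%
print(f"original estimate was off by {ORIGINAL_ESTIMATE / share_then:.1f}x")  # 8.8x
print(f"back then vs. now: {share_then / CURRENT_SHARE:.1f}x")                # 102.8x
print(f"window length: {window_days:.0f} days")                               # 366 days
```

So "about 10 times" and "100 times" are round figures for roughly 8.8 and 103, and the window spans one (leap) year.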

To determine whether a commit was pushed directly, the GitHub API was queried for pull requests associated with that commit (see [`associatedPullRequests`](https://docs.github.com/en/graphql/reference/objects#commit)).
If this list includes a merged pull request to the Nixpkgs master branch, the commit is known to be merged with a pull request.
Otherwise the commit could be directly pushed or be a false positive, which is why the above count is only an upper bound.
All obvious false positives ([example](https://github.com/NixOS/nixpkgs/commit/b09d18903c24b8aca88100df86aa2fdd5f05dfcd)) have been removed from this count and listing already.
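
The classification rule described above can be sketched as follows. This is not the script used for the RFC. It assumes the `associatedPullRequests` results have already been fetched and flattened into plain dictionaries carrying the GraphQL `merged` and `baseRefName` fields; the function names and input shape are hypothetical.

```python
from typing import Mapping, Sequence


def merged_via_pull_request(prs: Sequence[Mapping], base_branch: str = "master") -> bool:
    """True if any associated PR was merged into `base_branch`."""
    return any(
        pr.get("merged") and pr.get("baseRefName") == base_branch
        for pr in prs
    )


def possible_direct_pushes(
    commits: Mapping[str, Sequence[Mapping]], base_branch: str = "master"
) -> list:
    """Commit hashes with no merged PR into `base_branch`.

    The result is only an upper bound: it still contains false positives
    where the PR association is missing on GitHub's side.
    """
    return [
        sha
        for sha, prs in commits.items()
        if not merged_via_pull_request(prs, base_branch)
    ]


# Hypothetical sample data: commit hash -> associated pull requests.
sample = {
    "aaa": [{"merged": True, "baseRefName": "master"}],
    "bbb": [],                                            # no PR at all
    "ccc": [{"merged": False, "baseRefName": "master"}],  # PR never merged
    "ddd": [{"merged": True, "baseRefName": "staging"}],  # merged elsewhere
}
print(possible_direct_pushes(sample))
```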
Member

Actually, have you kept count of how many obvious false positives you had to remove? To estimate how many non-obvious false positives we probably have remaining…

Member Author

There were 79 obvious false positives, which after another look are all from seemingly buggy merges using GitHub's web interface. Here's a complete listing:

Complete listing of false positives

Though I don't think this needs to be part of the RFC because it doesn't change any arguments.

Member

Well, if you just add 79 to show that clear false positives are more numerous than «maybe, who knows», it does strengthen the argument about the rarity of true direct pushes and their removal changing nothing at all.

Member Author

Hmm I don't think it matters though, it's all just caused by imperfect ways to get at the right data (my script and GitHub bugs in this case). These false positives have no influence on any arguments and in fact only distract from the main point. In fact I should just "fix" my script to not count any commits from @web-flow at all, then I don't need to mention this at all because there aren't any false positives anymore.

Note that these false positives in my script are completely unrelated to the false positives in the automatic notifications in NixOS/nixpkgs#118661, which do have an impact because they annoy people.

Member

7c6f434c commented Jul 23, 2023

The false positives are relevant for two reasons.

  1. We know that what this RFC seeks to forbid is rare, but we don't even know if it ever happens at all nowadays.
  2. We do not have a real option to allow but closely watch and check after the fact if the case was exceptional, because no amount of waiting before asking GitHub will give you good data. «Allow-but-monitor» could be listed as an a priori plausible alternative by the way (with these false positives explaining why it doesn't work).

ETA: also, a comparison of web-flow and non-web-flow «positives» now and a few years ago tells us that most of the 5% back then were true positives, unless GitHub has dramatically changed the bug occurrence ratio between cases without fixing the (better defined) web-flow case.


## Staging workflow

The staging workflow is not affected because it [already uses pull requests](https://github.com/NixOS/nixpkgs/pull/241951) for all merges into the affected branches.
Member

That's not entirely true. Yes, the manual merge into staging-next is done with PR. However, there is also a workflow that automatically merges between master, staging and staging-next. And, as merge conflicts are a thing, these are often resolved with direct pushes.

Regarding the workflow for automatic merges: can that still be kept? I am thinking here about a potential next step, commit signing.

Member

Is conflict resolution using direct pushes to staging/staging-next, or also to master?

Member

Is conflict resolution using direct pushes to staging/staging-next, or also to master?

Right, I see now that staging* is excluded. Generally speaking staging-next to master can be resolved by merging master into staging-next first.

Member

Yes, the manual merge into staging-next is done with PR.

I'm pretty sure the initial merge from staging into staging-next is done manually. We only use a PR from staging-next into master.

Is conflict resolution using direct pushes to staging/staging-next, or also to master?

Yes, merge conflict resolution is usually done with direct pushes to staging-next.

Member

We only use a PR from staging-next into master.

Right, I meant that one!


## Staging workflow

The staging workflow is not affected because it [already uses pull requests](https://github.com/NixOS/nixpkgs/pull/241951) for all merges into the affected branches.
Member

Suggested change (original line, then proposed replacement):
The staging workflow is not affected because it [already uses pull requests](https://github.com/NixOS/nixpkgs/pull/241951) for all merges into the affected branches.
The staging workflow is not affected because it [already uses pull requests](https://github.com/NixOS/nixpkgs/pull/241951) for all merges into the affected branches, and the staging branches themselves are not protected.

Or any other wording to help resolve the ambiguity at first read.

infinisil changed the title from "[RFC 0156] No Direct Pushes" to "[RFC 0156] No Direct Nixpkgs Pushes" on Jul 23, 2023
- `release-*`: Used for stable channels

Staging branches are intentionally not included, because they will already require a pull request when they inevitably need to get merged into one of the above branches.
The same applies to similar long-term branches like `haskell-packages`.
Member

Suggested change (original line, then proposed replacement):
The same applies to similar long-term branches like `haskell-packages`.
The same applies to similar long-term branches like `haskell-updates`.

ghost left a comment

...as we slouch deeper into the abyss of Github addiction...

Comment on lines 22 to 26
This makes such commits susceptible to:
- Be anonymous
- Include malicious code
- Be broken
- Have poor code quality

All of these would also be addressed by "Require signed commits || Require a pull request before merging".

Member

As long as we use CI tied to PRs that depends on master evaluating cleanly (with only specific deviations), «be broken» does get easier to handle without direct pushes…

Member Author

I'm pretty sure signed commits don't make it harder to push malicious, broken or poor-quality code, why would they? The only thing they get you is that commits can't be anonymous anymore.

Member

I'm pretty sure signed commits don't make it harder to push malicious, broken or poor-quality code

It depends for whom. An attacker needs to have access not just to your token or SSH key for pushing, but also to the signing key. If I am correct, it's also no longer possible to make changes via the API, as it lacks the signing key. Of course, our contributors can always still push something broken themselves.

Member Author

I see, yeah I didn't consider such an attack scenario here. I'll definitely add this appropriately into the Alternatives and Future Work sections.

Member

On the contrary. The crux of the issue is increasing our level of dependence on git-hub-specific features.

This can and does apply to any "forge" or any process for development.

It's not "must be github produced pull requests", but "tangible item that allows tracking the origin". If Nixpkgs had a mailing-lists development process, it would be "any change must be sent as a patch set / pull request", and thus provide the same tangible "item".

The principle behind the change does not entrench us in any specific tooling; it makes a pretty much already universal development process mandatory.

Neither does opening a PR and merging it.

This leaves a common tangible "item" for discussing the bad contribution. Thus, the overall already near-universal development process is defined.

@infinisil do tell if you think I've misunderstood something here.

Member Author

infinisil commented Jul 25, 2023

Yeah you're on point @samueldr. And while I do like the thought of switching off GitHub and signing commits, this is not the place to discuss it (I'll add it to Alternatives/Future Work though)

Contributor

The crux of the issue is increasing our level of dependence on git-hub-specific features.

I'd also add to this that while GitHub has a UI for this, this is not GitHub specific. If it's later decided to host our own copy of the canonical Nix repository somewhere, you could still get the same feature this RFC asks for with a remote commit hook.
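
To illustrate the remote hook idea, here is a minimal sketch of a server-side git `pre-receive` hook written in Python. Git feeds such a hook one line per updated ref on stdin (`<old-sha> <new-sha> <refname>`), and a non-zero exit status rejects the push. The protected patterns follow the branches named in this thread; everything else is an assumption. A real setup would also need some way for the forge's own merge mechanism to bypass the check, which is not shown.

```python
import fnmatch
import sys

# Assumed patterns, taken from the branch names mentioned in this thread.
PROTECTED = ["refs/heads/master", "refs/heads/release-*"]


def rejected_updates(lines, protected=PROTECTED):
    """Return the ref names in pre-receive input that match a protected pattern."""
    rejected = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _old_sha, _new_sha, ref = parts
        if any(fnmatch.fnmatchcase(ref, pattern) for pattern in protected):
            rejected.append(ref)
    return rejected


def main(stdin=sys.stdin):
    """Hook entry point: returns 1 (reject) if any protected ref is touched."""
    refs = rejected_updates(stdin)
    for ref in refs:
        print(f"direct push to {ref} rejected, please open a pull request",
              file=sys.stderr)
    return 1 if refs else 0

# In an actual hook file, the script would end with: sys.exit(main())
```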

ghost commented Jul 28, 2023

It's not "must be github produced pull requests", but "tangible item that allows tracking the origin".

The RFC does not say "tangible item that allows tracking the origin".

It says:

# Detailed design

Turn on GitHub's "Require a pull request before merging"

https://github.com/tweag/rfcs/blob/709c8979ece291291ff12da8e206cb212a14652e/rfcs/0156-no-direct-pushes.md?plain=1#L43

this is not GitHub specific

The text above specifically references git-hub.

Member

The text above specifically references git-hub.

Aw come on, this is needlessly pedantic quibbling. Yes this is a GitHub feature, and therefore the RFC text mentions (and should mention) GitHub. No this is not a GitHub specific feature, as most other git forges offer a 1:1 equivalent.

@sternenseemann
Member

I think the RFC implementation should also include a change to ofborg making it do something useful if the master branch does not evaluate cleanly. A remaining use case for pushing to master is fixing evaluation failures that slipped in, where opening a PR literally gains you nothing but clicking around uselessly, since ofborg will fail regardless of your change.

(This will of course not eliminate the annoyances of the PR workflow, which are greatly exacerbated by the long CI wait time for trivial changes or trivial amends. That is definitely orthogonal to this RFC, though, and could be addressed by faster CI (if only!) or a merge queue.)

Comment on lines +128 to +132
## Emergency changes

Sometimes channels have blocking breakages and need to be fixed as soon as possible (citation needed).
Currently this can be done with direct pushes, but a pull request will be required with this proposal.
The time required to fix such breakages however is barely affected: Since there is currently no requirement for pull requests to be approved or pass CI, they can get merged immediately after opening if necessary.
Member Author

@sternenseemann See this section. Pull requests don't need to pass CI before they can be merged. So if you would have committed directly before, you now just need to open a PR and merge it immediately, no need to wait for CI. (That's of course not a great workflow, but that's kind of the point of this RFC, making these things more discoverable)

Member

Yes, but my point still stands. If we are to force this workflow, it should also have benefits for the change authors themselves, i.e. working CI. Everything else is just frustrating.

Member Author

The current proposal only affects about 0.05% of commits, has effectively zero downsides, and can be implemented in 1 minute, making me fairly confident that we can get this accepted quickly.

While better CI would be nice, it would expand the scope to also affect the other 99.95% of commits, involve more tradeoffs and take much longer to design and implement. I'd rather split this into a separate RFC for the future. Let's take small improvements where we can.

Member

As you say, this change affects very little and doesn't change anything about the fact that we need to trust committers. If they are fixing a typo in the manual, a PR gives them the benefit of confirmation that they did not accidentally wreck the manual build; all we are doing is forcing them to appreciate this benefit. This is not the case for an eval failure on master.

If we are to make PRs mandatory in every case, we should also make PR CI useful in every case.

Member Author

Yeah we probably should, but this doesn't have to creep into the scope of this RFC. PRs are useful whether CI passes or not. If you have a concrete plan on how to improve CI, feel free to open a separate RFC; it's entirely orthogonal.

Member

Well, PR CI is useful for you, PR discoverability is useful for everyone else; hopefully our committers are Kantians and not frustrated by this…

@AndersonTorres
Member

What about the pull requests in distress? We have almost literally decades of opened pull requests.

lheckemann added the "status: open for nominations" label on Jul 26, 2023

@lheckemann
Member

This RFC is now open for nominations!

@piegamesde
Member

Nominating myself. I do have ideas that go way beyond the current proposal and which will hopefully yield another RFC eventually, but I'll take any easy wins for now :)

[alternatives]: #alternatives

- It would be possible to implement a third-party interface to de-anonymize future commits (even if pushed directly to master) using the [push event GitHub webhook](https://docs.github.com/en/webhooks-and-events/webhooks/webhook-events-and-payloads#push), which includes the `sender` field to match the pushing GitHub user.
- This would not solve the other problems with direct pushes though: It still wouldn't notify others, trigger CI or be discoverable.
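
The webhook alternative above could look roughly like the sketch below. It only covers parsing an already-received `push` event body and attributing each listed commit to the `sender`; receiving the request, verifying its signature and storing the result are left out. The function name and the sample payload are hypothetical, and only the `ref`, `commits[].id` and `sender.login` payload fields are relied upon.

```python
import json


def pushers_by_commit(payload_json: str) -> dict:
    """Map each commit id in a push event to (pushing user, pushed ref)."""
    payload = json.loads(payload_json)
    sender = payload["sender"]["login"]
    ref = payload["ref"]
    return {commit["id"]: (sender, ref) for commit in payload.get("commits", [])}


# Hypothetical, heavily trimmed push event body.
example_event = json.dumps({
    "ref": "refs/heads/master",
    "commits": [{"id": "abc123"}, {"id": "def456"}],
    "sender": {"login": "some-committer"},
})
print(pushers_by_commit(example_event))
```

As the bullet above notes, such a mapping would only de-anonymize pushes; it would not notify others, trigger CI or make the change discoverable.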
Member

Notifications seem feasible to add if such a thing is implemented; however, I expect GitHub to lose some events on their side, just as they lose PR metadata for web-flow false positives (which is why the count is relevant there).

To determine whether a commit was pushed directly, the GitHub API was queried for pull requests associated with that commit (see [`associatedPullRequests`](https://docs.github.com/en/graphql/reference/objects#commit)).
If this list includes a merged pull request to the Nixpkgs master branch, the commit is known to be merged with a pull request.
Otherwise the commit could be directly pushed or be a false positive, which is why the above count is only an upper bound.
All obvious false positives ([example](https://github.com/NixOS/nixpkgs/commit/b09d18903c24b8aca88100df86aa2fdd5f05dfcd)) have been removed from this count and listing already.
Member

Suggested change (original line, then proposed replacement):
All obvious false positives ([example](https://github.com/NixOS/nixpkgs/commit/b09d18903c24b8aca88100df86aa2fdd5f05dfcd)) have been removed from this count and listing already.
All 79 obvious false positives ([example](https://github.com/NixOS/nixpkgs/commit/b09d18903c24b8aca88100df86aa2fdd5f05dfcd)) have been removed from this count and listing already.

# Detailed design
[design]: #detailed-design

Turn on GitHub's "Require a pull request before merging" [branch protection rule](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/managing-a-branch-protection-rule#creating-a-branch-protection-rule) for all branches whose commits get propagated into channels.
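
For readers who prefer the API over the settings page, the rule above could be expressed as a request to GitHub's REST "update branch protection" endpoint. The sketch below only builds the request and does not send it. It assumes that a `required_pull_request_reviews` object with zero required approvals corresponds to "Require a pull request before merging" without mandatory reviews, matching the thread's statement that neither approval nor passing CI is required. The helper names are hypothetical, and the endpoint takes a concrete branch name rather than a pattern such as `release-*`.

```python
import json


def protection_payload() -> dict:
    """Request body: require a PR, but no approvals and no status checks."""
    return {
        "required_status_checks": None,   # CI does not have to pass
        "enforce_admins": None,           # assumption: not specified in the thread
        "required_pull_request_reviews": {
            "required_approving_review_count": 0,  # a PR is needed, approval is not
        },
        "restrictions": None,             # no extra restriction on who may push
    }


def protection_request(owner: str, repo: str, branch: str) -> tuple:
    """Return (method, url, body) for protecting one concrete branch."""
    url = f"https://api.github.com/repos/{owner}/{repo}/branches/{branch}/protection"
    return "PUT", url, json.dumps(protection_payload())


method, url, body = protection_request("NixOS", "nixpkgs", "master")
print(method, url)
print(body)
```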
Member

7c6f434c commented Jul 28, 2023

Suggested change (original line, then proposed replacement):
Turn on GitHub's "Require a pull request before merging" [branch protection rule](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/managing-a-branch-protection-rule#creating-a-branch-protection-rule) for all branches whose commits get propagated into channels.
Require the changes from committers to go through the same technical process as external contributions for all branches whose commits get propagated into channels.
In our current situation, turn on GitHub's "Require a pull request before merging" [branch protection rule](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/managing-protected-branches/managing-a-branch-protection-rule#creating-a-branch-protection-rule).

(I think this improves the situation from the point of view of separating what is the generic change and what is the GitHub implementation; @amjoseph-nixpkgs was rightfully complaining about the intermixing.)

@ryantm
Member

ryantm commented Aug 2, 2023

Nominating myself to help shepherd.

@ghost

ghost commented Aug 5, 2023

What about the pull requests in distress?

Also, what about the pull requests which mysteriously vanish?

Go ahead, try looking at the PR numbered one above (NixOS/nixpkgs#66439) or below (NixOS/nixpkgs#66437); both are there.

What exactly happened here?

@piegamesde
Member

What about the pull requests in distress?

Also, what about the pull requests which mysteriously vanish?

I don't see how either question is on topic here, tbh

@RaitoBezarius
Member

This RFC does not touch on https://nixos.github.io/release-wiki/Release-Process.html which is a critical part of nixpkgs and requires direct pushes, at least for tagging purposes.

I don't think the RFC can take effect without (a) updating the release wiki and (b) providing appropriate replacements, as long as they are reasonable.

@infinisil
Member Author

This was being discussed on Matrix.

And I'd love to have another shepherd on this RFC so we can move forward.

@samueldr
Member

@infinisil an option that could help is an additional organization group for "trusted pushers" who can continue doing direct pushes, meaning that default committer access does not include it.

@7c6f434c
Member

Is tag pushing restricted by branch protection? Do I understand correctly that a release branch is initialised as a copy of master with no initial difference in commits?

(But RFC will probably have to say something about both)

@infinisil
Member Author

After yesterday's chat on Matrix, @zimbatm rightfully pointed out that such simple decisions shouldn't get deadlocked on RFCs. So we decided to just try out enabling such branch protections for some time, see if it causes any problems, and remove it again if it does (very unlikely): NixOS/nixpkgs#249117. And @RaitoBezarius was given exceptional direct push access until the release process no longer uses direct pushes.

I thank @zimbatm for probably saving us all a whole bunch of time by not having to go through the RFC process for this!

infinisil closed this Aug 14, 2023
infinisil deleted the no-direct-pushes branch August 14, 2023 13:52
infinisil restored the no-direct-pushes branch August 14, 2023 14:55

@infinisil
Member Author

Note to self: Don't delete the branch, this RFC's contents are being referenced by NixOS/nixpkgs#249117

@infinisil
Member Author

Removal of the direct push detection workflow: NixOS/nixpkgs#249151

@Ma27
Member

Ma27 commented Aug 14, 2023

Is the branch-protection that was just enabled also the reason for the "auto-merge" and blocking PRs with pending requested changes (even if I'm the one who filed the change requests)?

Don't get me wrong, I'm not necessarily opposed to that[1]. However, as a committer I think it's reasonable to know about all changes made to the GH repo (to make sure I don't miss any change in process and mess something up), and generally I'd appreciate it if committers were at least informed after such a change, rather than having everyone find out on their own why things are looking (and probably behaving) differently.

[1] Though I'm somewhat skeptical about the blocked merge in case of pending change requests

@infinisil
Member Author

@Ma27 Oh that's a good point, but let's discuss it here: NixOS/nixpkgs#249117 (comment)
