Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0140] Simple Package Paths #140

Merged
merged 25 commits into from
Jul 12, 2023
Merged
Changes from 3 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
3ca531a
RFC 140
infinisil Jan 30, 2023
269ac4f
Minor improvements
infinisil Jan 30, 2023
21c3493
Minor nits
infinisil Feb 2, 2023
e203231
Update co-authors and add pre-RFC reviewers
infinisil Mar 27, 2023
967beb1
pkg-fun.nix -> package.nix
infinisil Mar 27, 2023
b077e2f
Mid-sized refactor for improved clarity and incorporating feedback
infinisil Apr 17, 2023
1d7e66a
Re-add custom argument exception and various minor improvements
infinisil May 11, 2023
0054e7a
shard distribution stats, cleanup, more uniformity
infinisil May 11, 2023
ffad8ea
Readd accidentally removed definition
infinisil May 11, 2023
66f0225
Mention package variants
infinisil May 11, 2023
89ea8e7
Minor moving and formatting
infinisil May 11, 2023
4426f20
Changes from feedback in the meeting
infinisil May 12, 2023
4f0c06f
Link to demonstration of cherry-picking without problems
infinisil May 15, 2023
98f8287
Link to demonstrates of problematic/non-problematic Git operations
infinisil May 15, 2023
e6bd2f5
Names must be unique when lowercased
infinisil May 16, 2023
632fb4d
Properly close invisible anchor
infinisil May 17, 2023
71e40dc
Update summary to mention Nixpkgs more explicitly
infinisil May 17, 2023
46c23ae
Include more arguments and counter-arguments for pkgs/unit alternatives
infinisil May 17, 2023
f7c3056
Add shepherd team and nicks
infinisil May 17, 2023
f53a862
Convert frontmatter to a table
infinisil May 17, 2023
e5b2019
Fix table rendering
infinisil May 17, 2023
b72d482
Minor fixups
infinisil May 29, 2023
7a03d43
Explain unit and add more alternatives
infinisil Jun 9, 2023
13c4f39
unit -> by-name, remove "standard"
infinisil Jun 16, 2023
f15c1cf
Apply suggestions from code review
infinisil Jun 16, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
234 changes: 234 additions & 0 deletions rfcs/0140-simple-package-paths.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,234 @@
---
feature: simple-package-paths
start-date: 2022-09-02
author: Silvan Mosberger
co-authors: Nixpkgs Architecture Team
infinisil marked this conversation as resolved.
Show resolved Hide resolved
shepherd-team: (names, to be nominated and accepted by RFC steering committee)
shepherd-leader: (name to be appointed by RFC steering committee)
related-issues: https://github.com/NixOS/nixpkgs/pull/211832
---

# Summary
[summary]: #summary

Auto-generate trivial top-level attribute definitions in `pkgs/top-level/all-packages.nix` from a directory structure that matches the attribute name.
This makes it much easier to contribute new packages, since there's no more guessing needed as to where the package should go, both in the ad-hoc directory categories and in `pkgs/top-level/all-packages.nix`.

# Motivation
[motivation]: #motivation

- (Especially new) package contributors are having a hard time figuring out which files to add or edit. These are very common questions:
infinisil marked this conversation as resolved.
Show resolved Hide resolved
- Which directory should my package definition go in?
- What are all the categories and do they matter?
- What if the package has multiple matching categories?
- Why can't I build my package after adding the package file?
- Where in all-packages.nix should my package go?
- Figuring out where an attribute is defined is a bit tricky:
- First one has to find the definition of it in all-packages.nix to see what file it refers to
- On GitHub this is even more problematic, as the `all-packages.nix` file is [too big to be displayed by GitHub](https://github.com/NixOS/nixpkgs/blob/nixos-22.05/pkgs/top-level/all-packages.nix)
- Then go to that file's definition, which takes quite some time for navigation (unless you have a plugin that can jump to it directly)
- It also slows down or even locks up editors due to the file size
- `nix edit -f . package-attr` works, though that's not yet stable (it relies on the `nix-command` feature being enabled) and doesn't work with packages that don't set `meta.position` correctly).
- `all-packages.nix` frequently causes merge conflicts. It's a point of contention for all new packages

# Detailed design
[design]: #detailed-design

This RFC establishes the standard of using `pkgs/unit/${shard}/${name}` "unit" directories for the definitions of the Nix packages `pkgs.${name}` in Nixpkgs, where `shard = toLower (substring 0 2 name)`.
All unit directories are automatically discovered and incorporated into the `pkgs` set using `pkgs.${name} = pkgs.callPackage pkgs/unit/${shard}/${name}/pkg-fun.nix { }`.
Copy link
Member Author

@infinisil infinisil Mar 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@piegamesde See here, that's exactly what's done. pkg-fun.nix (or newly package.nix) is just the old default.nix (we could also call it default.nix, see alternatives section). And note how the second argument is {}, meaning the auto-calling only works for packages that don't require any arguments. However see the Custom arguments section, which allows you to also call the unit directories from all-packages.nix if you need to pass a non-{} argument.


The following requirements will be checked by CI.
This standard must be followed for newly added packages that can satisfy these requirements.
A treewide migration to this standard will be performed for existing packages that can satisfy these requirements.

## Structure

The `pkgs/unit` directory must only contain unit directories, and only in subdirectories of the form `${shard}/${name}`.
Each unit directory must contain at least a `pkg-fun.nix` file, but may contain arbitrary other files and directories.
infinisil marked this conversation as resolved.
Show resolved Hide resolved

This ensures that maintainers don't have to verify this structure manually, which is prone to mistakes.

## Only derivations

If `pkgs/unit/${shard}/${name}` exists, `pkgs.${name}` must be a derivation that can be built directly with `nix-build`.

This ensures that people can expect the unit directories to correspond to buildable packages and not functions like `pkgs.fetchFromGitHub` or `pkgs.buildRustCrate`.

## Stable boundary
infinisil marked this conversation as resolved.
Show resolved Hide resolved

Unit directories may only interact with the rest of Nixpkgs via the stable `pkgs.${name}` attributes, not with file references:
- Files inside a unit directory must not reference files outside that unit directory.
infinisil marked this conversation as resolved.
Show resolved Hide resolved
Therefore all dependencies on other packages must come from `pkg-fun.nix` arguments injected by `callPackage`.
This ensures that files in Nixpkgs can be moved around without breaking this package.
- Files outside a unit directory must not reference files inside that unit directory.
Therefore other packages can only depend on this package via `pkgs.${name}`.
This ensures that files within unit directories (except `pkg-fun.nix`) can be freely moved and changed without breaking any other packages.

The only notable exception to this rule is the `pkgs/top-level/all-packages.nix` file which may reference the `pkg-fun.nix` file according to the next requirement.

## Custom arguments

If `pkgs/top-level/all-packages.nix` contains a definition for the attribute `${name}` and the unit directory `pkgs/unit/${shard}/${name}` exists, then the attribute value must be defined as `pkgs.callPackage pkgs/unit/${shard}/${name}/pkg-fun.nix args`, where `args` may be a freely chosen expression.

This ensures that even if a package initially doesn't require a custom `args`, if it later does, it doesn't have to be moved out of the `pkgs/unit` directory to pass custom arguments.

## Examples
[examples]: #examples

To add a new package `pkgs.foobar` to Nixpkgs, one only needs to create the file `pkgs/unit/fo/foobar/pkg-fun.nix`.
No need to find an appropriate category nor to modify `pkgs/top-level/all-packages.nix` anymore.

With many packages, the `pkgs/unit` directory may look like this:

```
pkgs
└── unit
├── _0
│ ├── _0verkill
│ └── _0x
infinisil marked this conversation as resolved.
Show resolved Hide resolved
├── ch
│ ├── ChowPhaser
│ ├── CHOWTapeModel
│ ├── chroma
│ ┊
├── t
│ └── t
```

# Interactions
[interactions]: #interactions

- `nix edit` and search.nixos.org are unaffected, since they rely on `meta.position` to get the file to edit, which still works
infinisil marked this conversation as resolved.
Show resolved Hide resolved
- `git blame` locally and on GitHub is unaffected, since it follows file renames properly.
infinisil marked this conversation as resolved.
Show resolved Hide resolved
- A commonly recommended way of building package directories in Nixpkgs is to use `nix-build -E 'with import <nixpkgs> {}; callPackage pkgs/applications/misc/hello {}'`.
Since the path changes `pkg-fun.nix` is now used, this becomes like `nix-build -E 'with import <nixpkgs> {}; callPackage pkgs/unit/he/hello/pkg-fun.nix {}'`, which is harder for users.
However, calling a path like this is an anti-pattern anyway, because it doesn't use the correct Nixpkgs version and it doesn't use the correct argument overrides.
The correct way of doing it was to add the package to `pkgs/top-level/all-packages.nix`, then calling `nix-build -A hello`.
This `nix-build -E` workaround is partially motivated by the difficulty of knowing the mapping from attributes to package paths, which is what this RFC improves upon.
By teaching users that `pkgs/unit/*/<name>` corresponds to `nix-build -A <name>`, the need for such `nix-build -E` workarounds should disappear.

# Drawbacks
infinisil marked this conversation as resolved.
Show resolved Hide resolved
[drawbacks]: #drawbacks

- The existing categorization of packages gets lost. Counter-arguments:
- It was never that useful to begin with
- The categorization was always incomplete, because packages defined in the language package sets often don't get their own categorized file path.
- It was an inconvenient user interface, requiring a checkout or browsing through GitHub
- Many packages fit multiple categories, leading to multiple locations to search through instead of one
- There's other better ways of discovering similar packages, e.g. [Repology](https://repology.org/)
infinisil marked this conversation as resolved.
Show resolved Hide resolved
infinisil marked this conversation as resolved.
Show resolved Hide resolved
- This breaks `builtins.unsafeGetAttrPos "hello" pkgs`. Counter-arguments:
- We have to draw a line as to what constitutes the public interface of Nixpkgs. We have decided that making attribute position information part of that is not productive. For context, this information is already accepted to be unreliable at the language level, noting the `unsafe` part of the name.
- Support for this could be added to Nix (make `builtins.readDir` propagate file as a position)

# Alternatives
[alternatives]: #alternatives

## Alternate `pkgs/unit` structure

- Use a flat directory, e.g. `pkgs.hello` would be in `pkgs/unit/hello`.
- Good because it's simpler, both for the user and for the code
- Bad because the GitHub web interface only renders the first 1 000 entries (and we have about 10 000 that benefit from this transition, even given the restrictions)
- Bad because it makes `git` and file listings slower
- Use `substring 0 3 name` or `substring 0 4 name`. This was not done because it still leads to directories in `pkgs/unit` containing more than 1 000 entries, leading to the same problems.
- Use multi-level structure, like a 2-level 2-prefix structure where `hello` is in `pkgs/unit/he/ll/hello`,
if packages are less than 4 characters long, we will it out with `-`, e.g. `z` is in `pkgs/unit/z-/--/z`.
This is not great because it's more complicated, longer to type and it would improve performance only marginally.
- Use a dynamic structure where directories are rebalanced when they have too many entries.
E.g. `pkgs.foobar` could be in `pkgs/unit/f/foobar` initially.
But when there's more than 1 000 packages starting with `f`, all packages starting with `f` are distributed under 2-letter prefixes, moving `foobar` to `pkgs/unit/fo/foobar`.
This is not great because it's very complex to determine which directory to put a package in, making it bad for contributors.

## Alternate `pkg-fun.nix` filename

- `default.nix`:
- Bad because it doesn't have its main benefits here:
- Removing the need to specify the file name in expressions, but this does not apply because this file will be imported automatically by the code that replaces definitions from `all-packages.nix`.
- Removing the need to specify the file name on the command line, but this does not apply because a package function must be imported into an expression before it can be used, making `nix build -f pkgs/unit/hell/hello` equally broken regardless of file name.
- Not using `default.nix` frees up `default.nix` for a short expression that is actually buildable, e.g. `(import ../.. {}).hello`, although we don't yet have a use case for this that isn't covered by `nix-build ../.. -A $attrname`.
- Bad because using `default.nix` would tempt users to invoke `nix-build .` whereas making package functions auto-callable is a known anti-pattern as it duplicates the defaults.
- Good because `default.nix` is already a convention most people are used to
- `package.nix`/`pkg.nix`: Bad, because it makes the migration to a non-function form of overridable packages harder in the future.
infinisil marked this conversation as resolved.
Show resolved Hide resolved

## Alternate `pkgs/unit` location

- Use `unit` (at the Nixpkgs root) instead of `pkgs/unit`.
This is future proof in case we want to make the directory structure more general purpose, but this is out of scope
- Other name proposals were deemed worse: `pkg`, `component`, `part`, `mod`, `comp`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to nominate auto, to me this is more future-proof than unit so we can include multiple packages in one directory

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we can include multiple packages in one directory

How do you mean that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, we currently have element-desktop and element-web that share the same pin.json (or generic.nix in some other cases), in the future we might want to allow that to be defined in pkgs/unit, but I'm not quite sure the name unit fits the behavior of having multiple packages under the same directory, and auto is more future-proof in that case IMO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, yeah so this has been discussed before. I believe in the team the consensus was that multiple versions and variants should eventually end up in the same pkgs.element attribute, once we have the design for that figured out. In that case, all those variants would naturally be together in the pkgs/unit/el/element directory.

That's also kind of why unit was chosen, we talked about explaining it better before, something like:

unit: A collection of standardized files related to the same software component

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense to me, thanks for the explanation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming is really important, especially for something as ubiquitous. Please add the rationale for choosing "unit" over the other proposals, just as you did for all the other decisions. Otherwise it will produce guesswork, confusion, and needless resistance in readers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't dislike "auto" but I'd like to note that the name implies the package is automatically called. This means that packages with custom arguments would not fit in there anymore, unlike with the current proposal. Maybe they'd be put into "pkgs/manual" instead then? (which then opens up to confusion with manual as in documentation, etc.)


## Filepath backwards-compatibility

Additionally have a backwards-compatibility layer for moved paths, such as a symlink pointing from the old to the new location, or for Nix files even a `builtins.trace "deprecated" (import ../new/path)`.
- We are not doing this because it would give precedent to file paths being a stable API interface, which definitely shouldn't be the case (bar some exceptions).
- It would also lead to worse merge conflicts as the transition is happening, since Git would have to resolve a merge conflict between a symlink and a changed file.

## Not having the [reference requirement](#user-content-req-ref)

The reference requirement could be removed, which would allow unit directories to reference files outside themselves, and the other way around. This is not great because it encourages the use of file paths as an API, rather than explicitly exposing functionality from Nix expressions.

## Restrict design to try delay issues like "package variants" {#no-variants}

We perceived some uncertainty around [package variants](#def-package-variant) that led us to scope these out at first, but we did not identify a real problem that would arise from allowing non-auto-called attributes to reference `pkgs/unit` files. However, imposing unnecessary restrictions would be counterproductive because:

- The contributor experience would suffer, because it won't be obvious to everyone whether their package is allowed to go into `pkgs/unit`. This means that we'd fail to solve the goal "Which directory should my package definition go in?", leading to unnecessary requests for changes in pull requests.

- Changes in dependencies can require dependents to add an override, causing packages to be moved back and forth between unit directories and the general `pkgs` tree, worsening the problem as people have to decide categories *again*.

- When lifting the restriction, the reviewers have to adapt, again leading to unnecessary requests for changes in pull requests.

- We'd be protracting the migration by unnecessary gatekeeping or discovering some problem late.

That said, we did identify risks:

- We might get something wrong, and while we plan to incrementally migrate Nixpkgs to this new system anyway, starting with fewer units is good.
- Mitigation: only automate the renames of simple (`callPackage path { }`) calls, to keep the initial change small

- We might not focus enough on the foundation, while we could more easily relax requirements later.
- After more discussion, we feel confident that the manual `callPackage` calls are unlikely to cause issues that we wouldn't otherwise have.

## Recommend a `callPackage` pattern with default arguments

> - While this RFC doesn't address expressions where the second `callPackage` argument isn't `{}`, there is an easy way to transition to an argument of `{}`: For every attribute of the form `name = attrs.value;` in the argument, make sure `attrs` is in the arguments of the file, then add `name ? attrs.value` to the arguments. Then the expression in `all-packages.nix` can too be auto-called
> - Don't do this for `name = value` pairs though, that's an alias-like thing

`callPackage` does not favor the default argument when both a default argument and a value in `pkgs` exist. Changing the semantics of `callPackage` is out of scope.

## Allow `callPackage` arguments to be specified in `<unit>/args.nix`

The idea was to expand the auto-calling logic according to:

Unit directories are automatically discovered and transformed to a definition of the form
```
# If args.nix doesn't exist
pkgs.${name} = pkgs.callPackage ${unitDir}/pkg-fun.nix {}
# If args.nix does exists
pkgs.${name} = pkgs.callPackage ${unitDir}/pkg-fun.nix (import ${unitDir}/args.nix pkgs);
```

Pro:
- It makes another class of packages uniform, by picking a solution with restricted expressive power.

Con:
- It does not solve the contributor experience problem of having too many rules.
- `args.nix` is another pattern that contributors need to learn how to use, as we have seen that it is not immediately obvious to everyone how it works.
- A CI check can mitigate the possible lack of uniformity, and we see a simple implementation strategy for it.
- This keeps the contents of the unit directories simple and a bit more uniform than with optional `args.nix` files.
infinisil marked this conversation as resolved.
Show resolved Hide resolved

# Unresolved questions
[unresolved]: #unresolved-questions
infinisil marked this conversation as resolved.
Show resolved Hide resolved

# Future work
infinisil marked this conversation as resolved.
Show resolved Hide resolved
[future]: #future-work
infinisil marked this conversation as resolved.
Show resolved Hide resolved

All of these questions are in scope to be addressed in future discussions in the [Nixpkgs Architecture Team](https://nixos.org/community/teams/nixpkgs-architecture.html):

- Making the filetree more human-friendly by grouping files together by "topic" rather than technical delineations.
For instance, having a package definition, changelog, package-specific config generator and perhaps even NixOS module in one directory makes work on the package in a broad sense easier.
- This RFC only addresses the top-level attribute namespace, aka packages in `pkgs.<name>`, it doesn't do anything about package sets like `pkgs.python3Packages.<name>`, `pkgs.haskell.packages.ghc942.<name>`, which may or may not also benefit from a similar auto-calling
- Improve the semantics of `callPackage` and/or apply a better solution, such as a module-like solution

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the solutions discussed so far present an elegant solution, and I don't think one exists. Module solution sounds like it would be quite fitting here (although out of scope of the RFC), so leaving this here for reference:

https://www.youtube.com/watch?v=dTd499Y31ig

I think that would solve the jami-daemon = jami.jami-daemon problem by itself, and very possibly wlroots = wlroots_0_14 if version will become an option that has meaning.

Copy link
Member

@phaer phaer May 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module solution sounds like it would be quite fitting here (although out of scope of the RFC),

Yes, I that's probably out of scope for this PR, but there's relatively fresh working group about exactly this topic. We have a matrix room, weekly meetings on Friday and a Repo to manage things, see https://github.com/nixpkgs-architecture/pkgs-modules

EDIT: maybe we could reference the WG here?

- What to do with different versions, e.g. `wlroots = wlroots_0_14`? This goes into version resolution, a different problem to fix
- What to do about e.g. `libsForQt5.callPackage`? This goes into overrides, a different problem to fix
- What about aliases like `jami-daemon = jami.jami-daemon`?
Copy link

@quantenzitrone quantenzitrone May 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about another standardized autocalled file aliases.nix that contains aliases for packages and calls with custom arguments?
This way, everything belonging to a package would still be in the unit directory.
The aliases.nix will not be allowed to create an attribute with the same name as the containing folder:
For example: .../ne/nerdfonts/aliases.nix could look something like this:

{
  lib,
  pkgs,
  ...
}:
{
  # this is not allowed to contain a attribute named "nerdfonts"
  # because the default nerdfonts package (aka `callPackage` with no arguments) is autocalled
  nerdfonts_2 = "<some nerdfonts override for version 2.3.3 because version 3 had breaking changes>";
}
// (
  # i havent tested this but in theory this should create a <name>-nerdfonts package for every nerdfont
  # like there currently exists for terminus-nerdfonts and inconsolata-nerdfonts (in all-packages.nix)
  with lib;
    mapAttrs' (
      name: value: {
        name = "${toLower name}-nerdfonts";
        value = pkgs.callPackage ./package.nix {
          fonts = [name];
        };
      }
    )
    (import ./shas.nix)
)

To keep the scope small, that would be an idea for another follow-up RFC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep the scope small, that would be an idea for another follow-up RFC.

I agree.

I believe such an idea has been considered, but was rejected for scope reasons, and a potential performance issue.

As these files would each be able to affect the top level attrNames, they would all have to be parsed before constructing "pkgs". So far we've kept the proposal from having to scan the contents of all unit directories when evaluating a limited set of attribute values. Actually parsing files in many of them would change that. Most practical usages do not involve evaluating all of Nixpkgs, so we should optimize to keep "individual" package evaluations fast.
The actual performance impact has not been measured though, nor have we assessed the changes on the human side.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an answer missing here ;) I believe it might be good enough to keep them in all-packages.nix, just as with custom args, for now, in order to keep the scope small?

- What about `recurseIntoAttrs`? Not single packages, package sets, another problem