Cleanup and Stability Effort #6

Closed
12 of 21 tasks
maxmcd opened this issue Sep 2, 2020 · 5 comments

Comments

@maxmcd
Owner

maxmcd commented Sep 2, 2020

Most of the desired initial features are now complete, but there are a lot of gnarly edges. It seems to make sense to do a cleanup and stability sprint. Things to cover:

  • variable expansion is complex, figure out how env vars and $bramble_path work with each other
  • implement garbage collection
  • improve docs
    • no incorrect code in examples (implement tests in docs)
    • update various notes files as well
    • rewrite readme
    • add tons of examples
  • test that derivations can actually rebuild when needed
  • add lockfile and url hashes to lockfile
  • write lots of tests in bramble
  • start supporting different systems, get osx working
  • lazy derivation execution
    • cmd call triggers supporting derivation builds
  • multiple outputs created and working
  • can tests be run in any file? rust can do it, check zig?
  • debugger: open a shell on execution error (see "Possible to create a simple interactive debugger?", google/starlark-go#304)
  • parallel building
  • build a docker image (stability and cleanup huh?)
  • actually make derivations and cmd impossible to run at the top level
  • implement output path replacement for self-referential outputs
  • no automatic variable expansion in input functions
  • fork starlark-go and reimplement load function
@maxmcd
Owner Author

maxmcd commented Sep 12, 2020

Note on lazy derivation evaluation:

Calling the derivation function validates its params and args, then looks at all derivations used as input. If an input is a function, the function is called to check whether its return value is a derivation (or a list of derivations). These derivations are children of this derivation. Add them to the graph and move on to the next derivation. Connect all future derivations to the graph.

Once we hit a cmd that wants to run the derivation, or if we return the final derivation from a function, we check where this last derivation is on the graph. Then we start from the leaves of the tree and build in parallel by crawling the DAG. If there is an error we stop all jobs and exit with the error for that derivation (unless another mode is specified).
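
The leaf-first parallel build described above could look roughly like the following. This is a minimal Python sketch, not bramble's implementation: `build_graph`, the `deps` mapping, and the `build` callback are hypothetical names, and it assumes Python 3.9+ for `graphlib`.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def build_graph(deps, target, build, max_workers=4):
    """Build `target` and everything it depends on, leaves first.

    deps maps a derivation name to the names of its input derivations.
    build is called once per derivation, after all of its inputs are built.
    Returns the order in which builds completed.
    """
    # Keep only the part of the graph that the target actually needs.
    needed, stack = {}, [target]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed[name] = set(deps.get(name, ()))
            stack.extend(needed[name])

    sorter = TopologicalSorter(needed)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished, running = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Everything returned here has all of its inputs built already.
            for name in sorter.get_ready():
                running[pool.submit(build, name)] = name
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                try:
                    future.result()
                except Exception as err:
                    # Cancel jobs that have not started. Jobs already running
                    # in a thread cannot be interrupted and will finish.
                    for pending in running:
                        pending.cancel()
                    raise RuntimeError(f"building {name} failed") from err
                sorter.done(name)
                finished.append(name)
    return finished
```

Derivations that the target does not depend on are never built, which is the lazy part; a real builder would also need a way to interrupt in-flight jobs.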

@maxmcd
Owner Author

maxmcd commented Sep 13, 2020

the build path must be in the bramble store in case the path is used in the output

@maxmcd
Owner Author

maxmcd commented Sep 13, 2020

https://edolstra.github.io/pubs/phd-thesis.pdf, Section 6.7:

The outputs of a derivation can have references to each other, and in fact this is quite common. For instance, it can be expected that the programs in the bin output of Glibc depend on the libraries in the out output. This means that the out output is in the closure of the bin output, but not vice versa. But what happens when there are mutually recursive references, e.g., when out also refers to bin? These must be forbidden, since the hash rewriting scheme from Section 6.3.2 cannot handle them. For instance, when we copy out and bin to their content-addressable locations, we must rewrite in both FSOs the hashes of both paths. The function hashModulo only handles direct self-references, and it can do so because the hashes to be disregarded in the hash computation are encoded into the file name.

Fortunately, banning mutually recursive outputs is not a very onerous restriction, since they are pointless. After all, mutual recursion between output paths requires them to be deployed and garbage collected as a unit, negating the granularity advantages that multiple outputs are intended to achieve.
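
A check for the restriction quoted above might be sketched like this. It is an illustration only: `reachable`, `find_mutual_references`, and the `refs` mapping (output name to the set of sibling outputs it references) are hypothetical, and how references are discovered in the first place is out of scope. Direct self-references are allowed, while any pair of distinct outputs that can reach each other is reported.

```python
def reachable(refs, start):
    """Return every output reachable from start, ignoring direct self-references."""
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        for nxt in refs.get(current, ()):
            if nxt != current and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def find_mutual_references(refs):
    """Return sorted pairs of distinct outputs that refer to each other,
    directly or through other outputs."""
    closures = {name: reachable(refs, name) for name in refs}
    return sorted(
        (a, b)
        for a in refs
        for b in closures[a]
        if a < b and a in closures.get(b, ())
    )
```

With `{"bin": {"out"}, "out": {"out"}}` nothing is reported, since `bin` depending on `out` and `out` referring to itself are both fine. Adding a reference from `out` back to `bin` would be flagged.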

@maxmcd
Owner Author

maxmcd commented Sep 13, 2020

Section 5.4.1 of the same thesis:

If a fetchurl derivation followed the normal translation scheme, the output paths of the derivation and all derivations depending on it would change. For instance, if we were to change the URL of the Glibc source distribution—a component on which almost all other components depend—massive rebuilds will ensue. This is unfortunate for a change which we know cannot have a real effect as it propagates upwards through the dependency graph.

Fixed-output derivations solve this problem by allowing a derivation to state to Nix that its output will hash to a specific value. When Nix builds the derivation (Section 5.5), it will hash the output and check that the hash corresponds to the declared value. If there is a hash mismatch, the build fails and the output is not registered as valid. For fixed-output derivations, the computation of the output path only depends on the declared hash and hash algorithm, not on any other attributes of the derivation.

This is extremely interesting. I wonder if it's possible to take advantage of this pattern now that outputs are known and are intended to be consistent between systems. Could we store different versions of derivations on disk and link them to different outputs? Could we short-circuit rebuilding in certain situations?
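
To make the fixed-output idea concrete, here is a small Python sketch. It is neither Nix's nor bramble's real path scheme: `output_path`, `verify_fixed_output`, the `/store` prefix, and the string layout that gets hashed are all assumptions for illustration. The point is only that a fixed-output path is derived from the declared hash and algorithm, so changing other attributes such as the URL leaves the path unchanged.

```python
import hashlib


def output_path(name, attrs, fixed=None, store="/store"):
    """Compute an output path for a derivation.

    fixed is an optional (algorithm, declared_hex_digest) pair. When it is
    given, attrs (for example the URL) do not influence the path at all.
    """
    if fixed is not None:
        algo, digest = fixed
        material = f"fixed:{algo}:{digest}:{name}"
    else:
        # Normal derivations hash every attribute, so any change moves the path.
        material = f"drv:{name}:{sorted(attrs.items())!r}"
    prefix = hashlib.sha256(material.encode()).hexdigest()[:32]
    return f"{store}/{prefix}-{name}"


def verify_fixed_output(content, fixed):
    """Check built content against the declared hash; a mismatch fails the build."""
    algo, digest = fixed
    return hashlib.new(algo, content).hexdigest() == digest
```

Two fetches of the same tarball from different mirrors get the same path as long as they declare the same hash, so nothing downstream needs to rebuild.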

@maxmcd
Owner Author

maxmcd commented Mar 7, 2021

closing this since the framing is a little outdated and general work has resumed on the main branch

maxmcd closed this as completed Mar 7, 2021