Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it easier to write value transformers #3155

Open
bgrant0607 opened this issue May 13, 2022 · 6 comments
Open

Make it easier to write value transformers #3155

bgrant0607 opened this issue May 13, 2022 · 6 comments
Assignees
Labels
area/fn-sdk Typescript SDK documentation Improvements or additions to documentation enhancement New feature or request triaged Issue has been triaged by adding an `area/` label

Comments

@bgrant0607
Copy link
Contributor

Value transformers are critical to the power of kustomize and kpt, which really differentiate the approaches from monolithic configuration generators. We're going to need to build a lot more of them, and we need to make it as easy as possible to write such transformers, including guides (documentation, examples, videos, etc.), so that users don't fall back to less Config-as-Data-friendly approaches.

We’ve implemented several approaches to value propagation.

kustomize:

  1. Kustomize transformer plugins (guide, example)
  2. Kustomize vars (templates, more or less)
  3. Kustomize replacement transformer (example use)

kpt:
4. Pre-1.0 kpt setters and substitutions
5. Kpt 1.0 setters
6. Catalog specific-purpose transformers, such as set-namespace

The appealing aspect of (1) is that configuration authors and consumers only need to specify the values to propagate, not where to propagate them to. Where they need to be propagated to is consistent for a given type of value and a given resource type, or set of resource types for horizontal transforms.

Transforms that apply to many resource types or all types need to be written in a different manner than those that apply to one or two specific types. Some transformers are a mix of the two styles.

Perhaps we could even make a declarative value propagator function, like a more powerful replacement transformer with configuration that's reusable across packages.

The transformer functions require understanding the schemas of applicable resource types, at least the field paths of interest. Eventually we should ensure that types of interest publish sufficient information attached to the OpenAPI schemas and that kpt can read it, or that there's a standard way to patch OpenAPI schemas with such information, but for now we should use an approach similar to kustomize, and enumerate relevant fields in their varying representations, and also support user-specified paths (again, in a way that's reusable across packages).

We will eventually want a better mechanism for defining value types and mapping resource attributes (by value type and by resource type) to them so that users can easily add support for additional resource types (skaffold image replacement example, ClusterAPI example). As one example, kustomize’s object reference and label selector transformations don’t work for arbitrary resource types out of the box because it only has data for built-in Kubernetes types.

To simplify targeting/filtering expressions, we should explore leveraging the package structure (nested subpackages), but we’ll need to look more at example packages to validate that. Hierarchical dictionaries of facts are common in configuration systems: hiera, facter.

In addition to performing the replacements, we’ll also want to be able to identify replacement locations, such as for documentation and introspection.

Some example functions that apply to non-built-in types we could try to write include #2562, #2658, #2677, #1400.

@bgrant0607 bgrant0607 added documentation Improvements or additions to documentation enhancement New feature or request labels May 13, 2022
@yuwenma yuwenma self-assigned this May 13, 2022
@droot droot added area/fn-sdk Typescript SDK triaged Issue has been triaged by adding an `area/` label labels May 17, 2022
@bgrant0607
Copy link
Contributor Author

bgrant0607 commented Jun 6, 2022

One issue that came up when users tried set-namespace, set-labels, and set-image was the need for more flexibility in specifying where the input values should come from. Otherwise, the replacement transformer was needed to propagate the values first, which would be a lot of friction, in general.
cc @droot

@bgrant0607
Copy link
Contributor Author

Another example from slack:

apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: base
  annotations:
    config.kubernetes.io/local-config: "true"
info:
  description: kpt package for provisioning a deployment
pipeline:
  mutators:
  - image: gcr.io/kpt-fn/set-image:v0.1.1
    configMap:
      name: bla/a
      newName: newblabla/a
      newTag: latest
  - image: gcr.io/kpt-fn/set-image:v0.1.1
    configMap:
      name: bla/b
      newName: newblabla/b
      newTag: latest

This is another case where it would be convenient for functions to be more flexible regarding where their inputs come from, so the new repo could be specified separately, in a deployment-specific configuration. (It would also be nice to be able to separately specify the repo separately from the image name. https://catalog.kpt.dev/set-image/v0.1/set-image-advanced/)

One general approach we've been considering is capturing "context" in well defined types, to better enable plug&play between functions, where some would produce values and others would consume them, with less plumbing than typically required in other tools. Gcloud PoC: https://github.com/GoogleContainerTools/kpt/pull/2715/files. I probably should file an issue to document that pattern.

A simple example of this approach is used in the variant constructor pattern, using the package name as context: https://kpt.dev/guides/variant-constructor-pattern.

In this case, the "context" is the repo name. If that were defined in a client-side type (or, potentially, ConfigMap with well known name and key, as we currently do for the package name), then functions and tools could operate on that type, much as the Backstage plugin understands the ApplyReplacements type. For instance, a simple command, script, or function could generate or update that type with the repo name. set-repo.sh, or somesuch.

For comparison, in helm, it would need to be updated in values.yaml or specified on the helm command line. Templating values.yaml has been rejected many times (helm/helm#6876). If there's a tool for helm similar to tfvar (https://dev.to/shihanng/tfvar-a-tool-to-help-you-write-terraform-s-variable-definitions-1j65), I haven't seen it. Other than just yq.

@yuwenma sketched how a Go set-repo function could be structured:

type SetRepo struct {
	Repos map[string]string `json:"repos,omitempty"`
}

func Run(r *fn.ResourceList) (bool, error) {
    for _, kubeObject := range rl.Items {
        // if condition to decide whether kubeobject has repo to change
        if kubeObject.Where(//YOUR FUNC TO FIND REPO) {
            kubeObject.SetNestedStringOrDie(r.Repos["repo1"], // FIELDS TO REPO)
        }
    }
    return true, nil
}


func main() {
	if err := fn.AsMain(&SetRepo{}); err != nil {
		os.Exit(1)
	}
}

@mikebz
Copy link
Contributor

mikebz commented Jun 9, 2022

I think the problem with even the simplest SDK is that I have to build the image and publish it. I would advocate that for scenarios like that and internal functions the easiest thing to do is to build in Starlark and run it without a container. kpt simple transformer can not be much harder than bash + sed.

Templating of Helm values.yaml does not make sense because you can just put the right value in the values.yaml file and update the template to put it in the right places. It's not much more difficult than sed.

@yuwenma
Copy link
Contributor

yuwenma commented Jun 9, 2022

I think the problem with even the simplest SDK is that I have to build the image and publish it. I would advocate that for scenarios like that and internal functions the easiest thing to do is to build in Starlark and run it without a container.

@mikebz That's right. We want users to use starlark for simple scenarios and go SDK for more complicated cases. Though we have a pile of KRM function tasks and I don't think starlark is more mature than go SDK to recommend. Another reason is that the above use case is more like a set-image transformation, which assume might need to import some logic from the set-image, which is in Go.

kpt simple transformer can not be much harder than bash + sed.

I like this description. It can be the standards to evaluate our starlark SDK.

@bgrant0607
Copy link
Contributor Author

One original motivation for templating values.yaml was because the files can get huge. The ability to merge multiple values.yaml files mitigated that issue.

However, if a value is in an environment variable, then automatically getting it into values.yaml, such as in a CI/CD pipeline, has the same challenge we're discussing here with kpt.

I understand the Starlark SDK is less mature.

kpt still supports exec functions, doesn't it? So I think even with Go a container isn't required.

We could also make it easy to build CLIs that source, run functions, then sink, without needing to invoke the kpt CLI.

But, yes, yq or envsubst could work in this case, especially if the configuration could live outside the Kptfile.

That reminds me that we could use a ConfigMap generator: #3119.

@droot
Copy link
Contributor

droot commented Jun 9, 2022

I think it's worth sharing the recipe of propagating values from environment variable to any KRM fields. For example, this use-case can be addressed by invoking search-replace function imperatively. I am assuming that substituting repo name in the deployment spec needs to be done once.

Here are the steps involved:

Assuming following Kptfile in the app-base package:

# Kptfile in app-base package
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: app-base
  annotations:
    config.kubernetes.io/local-config: "true"
info:
  description: kpt package for provisioning a deployment
pipeline:
  mutators:
  - image: gcr.io/kpt-fn/set-image:v0.1.1
    configMap:
      name: bla/a
      newName: newblabla/a
      newTag: latest
  - image: gcr.io/kpt-fn/set-image:v0.1.1
    configMap:
      name: bla/b
      newName: newblabla/b
      newTag: latest

Assume thatapp-base also contains deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: frontend
  name: frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  strategy: {}
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: bla/a
      - name: backend
        image: bla/b

Let's create deployable instance from the app-base blueprint:

$ kpt pkg get <app-base> --for-deployment

# Assuming REPO_NAME environment variable contains the name `app1`
$ kpt fn eval -i search-replace:v0.2.0 -- by-value="newblabla/a" put-value="$REPO_NAME/a"
[RUNNING] "gcr.io/kpt-fn/search-replace:v0.2.0"
[PASS] "gcr.io/kpt-fn/search-replace:v0.2.0" in 1.1s
  Results:
    [info] pipeline.mutators[0].configMap.newName: Mutated field value to "app1/a"

$ kpt fn eval -i search-replace:v0.2.0 -- by-value="newblabla/b" put-value="$REPO_NAME/b"
[RUNNING] "gcr.io/kpt-fn/search-replace:v0.2.0"
[PASS] "gcr.io/kpt-fn/search-replace:v0.2.0" in 900ms
  Results:
    [info] pipeline.mutators[1].configMap.newName: Mutated field value to "app1/b"

# Now let's run render on it.
$ kpt fn render
Package "app-base":
[RUNNING] "gcr.io/kpt-fn/set-image:v0.1.1"
[PASS] "gcr.io/kpt-fn/set-image:v0.1.1" in 900ms
  Results:
    [info] apps/v1/Deployment/frontend spec.template.spec.containers.image: set image from bla/a to app1/a:latest
[RUNNING] "gcr.io/kpt-fn/set-image:v0.1.1"
[PASS] "gcr.io/kpt-fn/set-image:v0.1.1" in 700ms
  Results:
    [info] apps/v1/Deployment/frontend spec.template.spec.containers.image: set image from bla/b to app1/b:latest

Successfully executed 2 function(s) in 1 package(s).

Side note: I just realized set-image isn't very useful for declarative workflow because now images field have new values, kpt fn render re-runs won't do anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/fn-sdk Typescript SDK documentation Improvements or additions to documentation enhancement New feature or request triaged Issue has been triaged by adding an `area/` label
Projects
Development

No branches or pull requests

4 participants