Make it easier to write value transformers #3155

bgrant0607 · 2022-05-13T09:50:55Z

Value transformers are critical to the power of kustomize and kpt, which really differentiate the approaches from monolithic configuration generators. We're going to need to build a lot more of them, and we need to make it as easy as possible to write such transformers, including guides (documentation, examples, videos, etc.), so that users don't fall back to less Config-as-Data-friendly approaches.

We’ve implemented several approaches to value propagation.

kustomize:

Kustomize transformer plugins (guide, example)
Kustomize vars (templates, more or less)
Kustomize replacement transformer (example use)

kpt:
4. Pre-1.0 kpt setters and substitutions
5. Kpt 1.0 setters
6. Catalog specific-purpose transformers, such as set-namespace

The appealing aspect of (1) is that configuration authors and consumers only need to specify the values to propagate, not where to propagate them to. Where they need to be propagated to is consistent for a given type of value and a given resource type, or set of resource types for horizontal transforms.

Transforms that apply to many resource types or all types need to be written in a different manner than those that apply to one or two specific types. Some transformers are a mix of the two styles.

Perhaps we could even make a declarative value propagator function, like a more powerful replacement transformer with configuration that's reusable across packages.

Replicas example: plugin, filter, fields
Images example: plugin, filter, fields, example (it’s a problem that the kinds are not specified)

The transformer functions require understanding the schemas of applicable resource types, at least the field paths of interest. Eventually we should ensure that types of interest publish sufficient information attached to the OpenAPI schemas and that kpt can read it, or that there's a standard way to patch OpenAPI schemas with such information, but for now we should use an approach similar to kustomize, and enumerate relevant fields in their varying representations, and also support user-specified paths (again, in a way that's reusable across packages).

We will eventually want a better mechanism for defining value types and mapping resource attributes (by value type and by resource type) to them so that users can easily add support for additional resource types (skaffold image replacement example, ClusterAPI example). As one example, kustomize’s object reference and label selector transformations don’t work for arbitrary resource types out of the box because it only has data for built-in Kubernetes types.

To simplify targeting/filtering expressions, we should explore leveraging the package structure (nested subpackages), but we’ll need to look more at example packages to validate that. Hierarchical dictionaries of facts are common in configuration systems: hiera, facter.

In addition to performing the replacements, we’ll also want to be able to identify replacement locations, such as for documentation and introspection.

Some example functions that apply to non-built-in types we could try to write include #2562, #2658, #2677, #1400.

bgrant0607 · 2022-06-06T15:22:58Z

One issue that came up when users tried set-namespace, set-labels, and set-image was the need for more flexibility in specifying where the input values should come from. Otherwise, the replacement transformer was needed to propagate the values first, which would be a lot of friction, in general.
cc @droot

bgrant0607 · 2022-06-09T19:23:11Z

Another example from slack:

apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: base
  annotations:
    config.kubernetes.io/local-config: "true"
info:
  description: kpt package for provisioning a deployment
pipeline:
  mutators:
  - image: gcr.io/kpt-fn/set-image:v0.1.1
    configMap:
      name: bla/a
      newName: newblabla/a
      newTag: latest
  - image: gcr.io/kpt-fn/set-image:v0.1.1
    configMap:
      name: bla/b
      newName: newblabla/b
      newTag: latest

This is another case where it would be convenient for functions to be more flexible regarding where their inputs come from, so the new repo could be specified separately, in a deployment-specific configuration. (It would also be nice to be able to separately specify the repo separately from the image name. https://catalog.kpt.dev/set-image/v0.1/set-image-advanced/)

One general approach we've been considering is capturing "context" in well defined types, to better enable plug&play between functions, where some would produce values and others would consume them, with less plumbing than typically required in other tools. Gcloud PoC: https://github.com/GoogleContainerTools/kpt/pull/2715/files. I probably should file an issue to document that pattern.

A simple example of this approach is used in the variant constructor pattern, using the package name as context: https://kpt.dev/guides/variant-constructor-pattern.

In this case, the "context" is the repo name. If that were defined in a client-side type (or, potentially, ConfigMap with well known name and key, as we currently do for the package name), then functions and tools could operate on that type, much as the Backstage plugin understands the ApplyReplacements type. For instance, a simple command, script, or function could generate or update that type with the repo name. set-repo.sh, or somesuch.

For comparison, in helm, it would need to be updated in values.yaml or specified on the helm command line. Templating values.yaml has been rejected many times (helm/helm#6876). If there's a tool for helm similar to tfvar (https://dev.to/shihanng/tfvar-a-tool-to-help-you-write-terraform-s-variable-definitions-1j65), I haven't seen it. Other than just yq.

@yuwenma sketched how a Go set-repo function could be structured:

type SetRepo struct {
	Repos map[string]string `json:"repos,omitempty"`
}

func Run(r *fn.ResourceList) (bool, error) {
    for _, kubeObject := range rl.Items {
        // if condition to decide whether kubeobject has repo to change
        if kubeObject.Where(//YOUR FUNC TO FIND REPO) {
            kubeObject.SetNestedStringOrDie(r.Repos["repo1"], // FIELDS TO REPO)
        }
    }
    return true, nil
}


func main() {
	if err := fn.AsMain(&SetRepo{}); err != nil {
		os.Exit(1)
	}
}

mikebz · 2022-06-09T20:11:18Z

I think the problem with even the simplest SDK is that I have to build the image and publish it. I would advocate that for scenarios like that and internal functions the easiest thing to do is to build in Starlark and run it without a container. kpt simple transformer can not be much harder than bash + sed.

Templating of Helm values.yaml does not make sense because you can just put the right value in the values.yaml file and update the template to put it in the right places. It's not much more difficult than sed.

yuwenma · 2022-06-09T20:39:11Z

I think the problem with even the simplest SDK is that I have to build the image and publish it. I would advocate that for scenarios like that and internal functions the easiest thing to do is to build in Starlark and run it without a container.

@mikebz That's right. We want users to use starlark for simple scenarios and go SDK for more complicated cases. Though we have a pile of KRM function tasks and I don't think starlark is more mature than go SDK to recommend. Another reason is that the above use case is more like a set-image transformation, which assume might need to import some logic from the set-image, which is in Go.

kpt simple transformer can not be much harder than bash + sed.

I like this description. It can be the standards to evaluate our starlark SDK.

bgrant0607 · 2022-06-09T20:53:44Z

One original motivation for templating values.yaml was because the files can get huge. The ability to merge multiple values.yaml files mitigated that issue.

However, if a value is in an environment variable, then automatically getting it into values.yaml, such as in a CI/CD pipeline, has the same challenge we're discussing here with kpt.

I understand the Starlark SDK is less mature.

kpt still supports exec functions, doesn't it? So I think even with Go a container isn't required.

We could also make it easy to build CLIs that source, run functions, then sink, without needing to invoke the kpt CLI.

But, yes, yq or envsubst could work in this case, especially if the configuration could live outside the Kptfile.

That reminds me that we could use a ConfigMap generator: #3119.

droot · 2022-06-09T23:18:13Z

I think it's worth sharing the recipe of propagating values from environment variable to any KRM fields. For example, this use-case can be addressed by invoking search-replace function imperatively. I am assuming that substituting repo name in the deployment spec needs to be done once.

Here are the steps involved:

Assuming following Kptfile in the app-base package:

# Kptfile in app-base package
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: app-base
  annotations:
    config.kubernetes.io/local-config: "true"
info:
  description: kpt package for provisioning a deployment
pipeline:
  mutators:
  - image: gcr.io/kpt-fn/set-image:v0.1.1
    configMap:
      name: bla/a
      newName: newblabla/a
      newTag: latest
  - image: gcr.io/kpt-fn/set-image:v0.1.1
    configMap:
      name: bla/b
      newName: newblabla/b
      newTag: latest

Assume thatapp-base also contains deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: frontend
  name: frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  strategy: {}
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: bla/a
      - name: backend
        image: bla/b

Let's create deployable instance from the app-base blueprint:

$ kpt pkg get <app-base> --for-deployment

# Assuming REPO_NAME environment variable contains the name `app1`
$ kpt fn eval -i search-replace:v0.2.0 -- by-value="newblabla/a" put-value="$REPO_NAME/a"
[RUNNING] "gcr.io/kpt-fn/search-replace:v0.2.0"
[PASS] "gcr.io/kpt-fn/search-replace:v0.2.0" in 1.1s
  Results:
    [info] pipeline.mutators[0].configMap.newName: Mutated field value to "app1/a"

$ kpt fn eval -i search-replace:v0.2.0 -- by-value="newblabla/b" put-value="$REPO_NAME/b"
[RUNNING] "gcr.io/kpt-fn/search-replace:v0.2.0"
[PASS] "gcr.io/kpt-fn/search-replace:v0.2.0" in 900ms
  Results:
    [info] pipeline.mutators[1].configMap.newName: Mutated field value to "app1/b"

# Now let's run render on it.
$ kpt fn render
Package "app-base":
[RUNNING] "gcr.io/kpt-fn/set-image:v0.1.1"
[PASS] "gcr.io/kpt-fn/set-image:v0.1.1" in 900ms
  Results:
    [info] apps/v1/Deployment/frontend spec.template.spec.containers.image: set image from bla/a to app1/a:latest
[RUNNING] "gcr.io/kpt-fn/set-image:v0.1.1"
[PASS] "gcr.io/kpt-fn/set-image:v0.1.1" in 700ms
  Results:
    [info] apps/v1/Deployment/frontend spec.template.spec.containers.image: set image from bla/b to app1/b:latest

Successfully executed 2 function(s) in 1 package(s).

Side note: I just realized set-image isn't very useful for declarative workflow because now images field have new values, kpt fn render re-runs won't do anything.

bgrant0607 added documentation Improvements or additions to documentation enhancement New feature or request labels May 13, 2022

yuwenma self-assigned this May 13, 2022

droot added area/fn-sdk Typescript SDK triaged Issue has been triaged by adding an `area/` label labels May 17, 2022

bgrant0607 mentioned this issue Jun 7, 2022

First-class treatment of identifier transformations #3292

Open

mortent added this to ToDo in kpt kanban board Jun 8, 2022

This was referenced Jun 30, 2022

Add ability to specify pipeline function config with selector #3304

Open

Could we just use kustomize for transformations? #3121

Open

This was referenced Jul 8, 2022

Epic: WYSIWYG Kubernetes Application Configuration #3351

Open

Document how to approach UX #3145

Open

bgrant0607 mentioned this issue Jul 20, 2022

Flesh out the input data model and patterns #3396

Open

johnbelamaric mentioned this issue May 16, 2023

CEL for function inputs #3964

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make it easier to write value transformers #3155

Make it easier to write value transformers #3155

bgrant0607 commented May 13, 2022

bgrant0607 commented Jun 6, 2022 •

edited

Loading

bgrant0607 commented Jun 9, 2022

mikebz commented Jun 9, 2022

yuwenma commented Jun 9, 2022 •

edited

Loading

bgrant0607 commented Jun 9, 2022

droot commented Jun 9, 2022 •

edited

Loading

Make it easier to write value transformers #3155

Make it easier to write value transformers #3155

Comments

bgrant0607 commented May 13, 2022

bgrant0607 commented Jun 6, 2022 • edited Loading

bgrant0607 commented Jun 9, 2022

mikebz commented Jun 9, 2022

yuwenma commented Jun 9, 2022 • edited Loading

bgrant0607 commented Jun 9, 2022

droot commented Jun 9, 2022 • edited Loading

bgrant0607 commented Jun 6, 2022 •

edited

Loading

yuwenma commented Jun 9, 2022 •

edited

Loading

droot commented Jun 9, 2022 •

edited

Loading