feat: Add sample front end helm chart (#320)
**Reason for Change**:
This helm chart provides a straightforward example for deploying a
sample UI on top of a kaito workspace.
ishaansehgal99 committed Apr 1, 2024
1 parent 08dd1f4 commit 05fb90c
Showing 12 changed files with 439 additions and 0 deletions.
23 changes: 23 additions & 0 deletions charts/DemoUI/inference/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
15 changes: 15 additions & 0 deletions charts/DemoUI/inference/Chart.yaml
@@ -0,0 +1,15 @@
apiVersion: v2
name: inference
description: A Helm chart for chainlit
type: application
version: 0.1.0
appVersion: "0.1.0"
sources:
  - https://github.com/Azure/kaito
maintainers:
  - name: ishaansehgal99
    email: [email protected]
  - name: Fei-Guo
    email: [email protected]
  - name: helayoty
    email: [email protected]
44 changes: 44 additions & 0 deletions charts/DemoUI/inference/README.md
@@ -0,0 +1,44 @@
# KAITO Demo Frontend Helm Chart
## Install
Before deploying the Demo front-end, you must set the `workspaceServiceURL` environment variable to point to your Workspace Service inference endpoint.

To set this value, modify the `values.override.yaml` file or use the `--set` flag during Helm install/upgrade:

```bash
helm install inference-frontend ./charts/DemoUI/inference --set env.workspaceServiceURL="http://<CLUSTER_IP>:80/chat"
```

Or through a custom `values` file (`values.override.yaml`):
```bash
helm install inference-frontend ./charts/DemoUI/inference -f values.override.yaml
```
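For reference, a minimal `values.override.yaml` can be generated like this. The workspace service URL below is a hypothetical example; substitute your own inference endpoint:

```shell
# Write a minimal override file for the chart.
# The URL is a placeholder example, not a real endpoint.
cat > values.override.yaml <<'EOF'
env:
  workspaceServiceURL: "http://workspace-falcon-7b.default.svc.cluster.local:80/chat"
EOF
```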

## Values

| Key | Type | Default | Description |
|-------------------------------|--------|-------------------------|-------------------------------------------------------|
| `replicaCount` | int | `1` | Number of replicas |
| `image.repository` | string | `"python"` | Image repository |
| `image.pullPolicy` | string | `"IfNotPresent"` | Image pull policy |
| `image.tag` | string | `"3.8"` | Image tag |
| `imagePullSecrets` | list | `[]` | Specify image pull secrets |
| `podAnnotations` | object | `{}` | Annotations to add to the pod |
| `serviceAccount.create` | bool | `false` | Specifies whether a service account should be created |
| `serviceAccount.name` | string | `""` | The name of the service account to use |
| `service.type` | string | `"ClusterIP"` | Service type |
| `service.port` | int | `8000` | Service port |
| `env.workspaceServiceURL` | string | `"<YOUR_SERVICE_URL>"` | Workspace Service URL for the inference endpoint |
| `resources.limits.cpu` | string | `"500m"` | CPU limit |
| `resources.limits.memory` | string | `"256Mi"` | Memory limit |
| `resources.requests.cpu` | string | `"10m"` | CPU request |
| `resources.requests.memory` | string | `"128Mi"` | Memory request |
| `livenessProbe.exec.command` | list | `["pgrep", "chainlit"]` | Command for liveness probe |
| `readinessProbe.exec.command` | list | `["pgrep", "chainlit"]` | Command for readiness probe |
| `nodeSelector` | object | `{}` | Node labels for pod assignment |
| `tolerations` | list | `[]` | Tolerations for pod assignment |
| `affinity` | object | `{}` | Affinity for pod assignment |
| `ingress.enabled` | bool | `false` | Enable or disable ingress |

### Liveness and Readiness Probes

The `livenessProbe` and `readinessProbe` are configured to check if the Chainlit application is running by using `pgrep` to find the process. Adjust these probes as necessary for your deployment.
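If `pgrep` is unavailable in your image, or you prefer an endpoint-level check, the probes can be overridden with an HTTP variant. The sketch below is an assumption-laden example: it presumes the Chainlit server answers plain HTTP on the container's named `http` port (8000 by default) at `/`; adjust the path and port for your deployment:

```shell
# Hypothetical override replacing the pgrep-based probes with HTTP checks.
# Assumes Chainlit serves HTTP at "/" on the named container port "http".
cat > probes.override.yaml <<'EOF'
livenessProbe:
  httpGet:
    path: /
    port: http
readinessProbe:
  httpGet:
    path: /
    port: http
EOF
```

Apply it with `helm upgrade inference-frontend ./charts/DemoUI/inference -f probes.override.yaml`.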
22 changes: 22 additions & 0 deletions charts/DemoUI/inference/templates/NOTES.txt
@@ -0,0 +1,22 @@
Get the application URL by running these commands:
{{- if .Values.ingress.enabled }}
{{- range $host := .Values.ingress.hosts }}
{{- range .paths }}
http{{ if $.Values.ingress.tls }}s{{ end }}://{{ $host.host }}{{ .path }}
{{- end }}
{{- end }}
{{- else if contains "NodePort" .Values.service.type }}
export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "inference.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.service.type }}
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch the status of it by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "inference.fullname" . }}'
export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "inference.fullname" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}")
echo http://$SERVICE_IP:{{ .Values.service.port }}
{{- else if contains "ClusterIP" .Values.service.type }}
export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "inference.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}")
export CONTAINER_PORT=$(kubectl get pod --namespace {{ .Release.Namespace }} $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:$CONTAINER_PORT
{{- end }}
62 changes: 62 additions & 0 deletions charts/DemoUI/inference/templates/_helpers.tpl
@@ -0,0 +1,62 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "inference.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "inference.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "inference.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "inference.labels" -}}
helm.sh/chart: {{ include "inference.chart" . }}
{{ include "inference.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "inference.selectorLabels" -}}
app.kubernetes.io/name: {{ include "inference.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Create the name of the service account to use
*/}}
{{- define "inference.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "inference.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
77 changes: 77 additions & 0 deletions charts/DemoUI/inference/templates/deployment.yaml
@@ -0,0 +1,77 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "inference.fullname" . }}
  labels:
    {{- include "inference.labels" . | nindent 4 }}
spec:
  selector:
    matchLabels:
      {{- include "inference.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "inference.labels" . | nindent 8 }}
        {{- with .Values.podLabels }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "inference.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
        - name: {{ .Chart.Name }}
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          command: ["/bin/sh"]
          args:
            - -c
            - |
              mkdir -p /app/frontend && \
              pip install chainlit requests && \
              wget -O /app/frontend/inference.py https://raw.githubusercontent.com/Azure/kaito/main/demo/inferenceUI/chainlit.py && \
              chainlit run frontend/inference.py -w
          env:
            - name: WORKSPACE_SERVICE_URL
              value: "{{ .Values.env.workspaceServiceURL }}"
          workingDir: /app
          ports:
            - name: http
              containerPort: {{ .Values.service.port }}
              protocol: TCP
          livenessProbe:
            {{- toYaml .Values.livenessProbe | nindent 12 }}
          readinessProbe:
            {{- toYaml .Values.readinessProbe | nindent 12 }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          {{- with .Values.volumeMounts }}
          volumeMounts:
            {{- toYaml . | nindent 12 }}
          {{- end }}
      {{- with .Values.volumes }}
      volumes:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
15 changes: 15 additions & 0 deletions charts/DemoUI/inference/templates/service.yaml
@@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
  name: {{ include "inference.fullname" . }}
  labels:
    {{- include "inference.labels" . | nindent 4 }}
spec:
  type: {{ .Values.service.type }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "inference.selectorLabels" . | nindent 4 }}
13 changes: 13 additions & 0 deletions charts/DemoUI/inference/templates/serviceaccount.yaml
@@ -0,0 +1,13 @@
{{- if .Values.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ include "inference.serviceAccountName" . }}
  labels:
    {{- include "inference.labels" . | nindent 4 }}
  {{- with .Values.serviceAccount.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
automountServiceAccountToken: {{ .Values.serviceAccount.automount }}
{{- end }}
47 changes: 47 additions & 0 deletions charts/DemoUI/inference/values.yaml
@@ -0,0 +1,47 @@
# values.yaml for Chainlit Front-end

replicaCount: 1

image:
  repository: python
  pullPolicy: IfNotPresent
  tag: "3.8"

imagePullSecrets: []
podAnnotations: {}

serviceAccount:
  create: false
  name: ""
  # Referenced by templates/serviceaccount.yaml (automountServiceAccountToken).
  automount: true

service:
  type: ClusterIP
  port: 8000

# env:
#   Workspace Service URL
#   Specify the URL for the Workspace Service inference endpoint. Use the DNS name within the cluster for reliability.
#
#   Examples:
#     Cluster IP: "http://<CLUSTER_IP>:80/chat"
#     DNS name:   "http://<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:80/chat"
#     e.g., "http://workspace-falcon-7b.default.svc.cluster.local:80/chat"
#
#   workspaceServiceURL: "<YOUR_SERVICE_URL>"

resources:
  limits:
    cpu: 500m
    memory: 256Mi
  requests:
    cpu: 10m
    memory: 128Mi

livenessProbe:
  exec:
    command:
      - pgrep
      - chainlit

readinessProbe:
  exec:
    command:
      - pgrep
      - chainlit

nodeSelector: {}
tolerations: []
affinity: {}

ingress:
  enabled: false
7 changes: 7 additions & 0 deletions demo/README.md
@@ -0,0 +1,7 @@
## KAITO Demos Overview

Welcome to the KAITO demos directory! Here you'll find a collection of demonstration applications designed to showcase various functionalities and integrations with the KAITO Workspace. Feel free to explore!

For specific instructions and details, please refer to the README.md file within each demo's directory.
61 changes: 61 additions & 0 deletions demo/inferenceUI/README.md
@@ -0,0 +1,61 @@
## KAITO InferenceUI Demo

The KAITO InferenceUI Demo provides a sample front-end application that demonstrates how to interface with the KAITO Workspace for inference tasks. This guide covers deploying the front-end as a Helm chart in a Kubernetes environment, as well as how to run the Python application independently.

### Prerequisites

- A Kubernetes cluster with Helm installed
- Access to the KAITO Workspace Service endpoint

## Deployment with Helm
Deploy the KAITO InferenceUI Demo by setting the `workspaceServiceURL` environment variable to your Workspace Service endpoint.


### Configuring the Workspace Service URL
- Using the `--set` flag:

```bash
helm install inference-frontend ./charts/DemoUI/inference --set env.workspaceServiceURL="http://<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:80/chat"
```
- Using a custom `values.override.yaml` file:
```yaml
env:
workspaceServiceURL: "http://<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:80/chat"
```
Then deploy with the custom values file:
```bash
helm install inference-frontend ./charts/DemoUI/inference -f ./charts/DemoUI/inference/values.override.yaml
```

Replace `<SERVICE_NAME>` and `<NAMESPACE>` with your service's name and Kubernetes namespace.
This DNS naming convention ensures reliable service resolution within your cluster.
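The DNS form of the URL can be assembled from its parts, which makes the substitution explicit. The service name below is a hypothetical example:

```shell
# Build the in-cluster URL from the service name and namespace.
SERVICE_NAME="workspace-falcon-7b"   # hypothetical workspace service name
NAMESPACE="default"
WORKSPACE_SERVICE_URL="http://${SERVICE_NAME}.${NAMESPACE}.svc.cluster.local:80/chat"
echo "${WORKSPACE_SERVICE_URL}"
```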

## Accessing the Application
After deploying, access the KAITO InferenceUI based on your service type:
- NodePort:
```bash
export NODE_PORT=$(kubectl get --namespace default -o jsonpath="{.spec.ports[0].nodePort}" services inference-frontend)
export NODE_IP=$(kubectl get nodes --namespace default -o jsonpath="{.items[0].status.addresses[0].address}")
echo "Access your application at http://$NODE_IP:$NODE_PORT"
```
- LoadBalancer (It may take a few minutes for the LoadBalancer IP to be available):
```bash
export SERVICE_IP=$(kubectl get svc --namespace default inference-frontend --template "{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}")
echo "Access your application at http://$SERVICE_IP:8000"
```
- ClusterIP (Use port-forwarding to access your application locally):
```bash
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=inference" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace default port-forward $POD_NAME 8080:8000
echo "Visit http://127.0.0.1:8080 to use your application"
```

---

For additional support or to report issues, please contact the development
team at <[email protected]>.