Generic Stackdriver alert WebHook handler implemented in Cloud Run

mchmarny/stackdriver-notification-handler

Generic Stackdriver Alert WebHook Handler

This simple Cloud Run service handles Stackdriver notifications triggered by alerting policies and publishes them to a Cloud PubSub topic for additional handlers to process downstream (see pubsub-to-bigquery-pump for a real-world example of how this service can be used).

What

A single Stackdriver notification channel (WebHook) targets a single Cloud Run service that can handle one or more Stackdriver alerting policies. The service validates the WebHook token to make sure each notification comes from a valid source, and then relays the message to a PubSub topic.

Why

  1. Unique event triggers - Stackdriver can be configured with multiple alerting policies to capture many GCP events that are currently not available through other means.
  2. Works around channel limits - Stackdriver has a limit of 16 notification channels. This service allows you to create a single channel and route any number of alerting policies through it to PubSub, where you can use GCF or Cloud Run to process these events.

Notifications

The notifications published to the PubSub topic differ in content depending on the type of policy that triggered them (see alert samples). Here is an example of an incident alert for a metered resource (in this case, a PubSub subscription):

{
  "incident": {
    "incident_id": "0.lekp2pr4h14z",
    "resource_id": "",
    "resource_name": "cloudylabs Cloud Pub/Sub Subscription labels {subscription_id=pubsub-to-bigquery-pump-sub}",
    "resource": {
      "type": "pubsub_subscription",
      "labels": { "subscription_id": "pubsub-to-bigquery-pump-sub" }
    },
    "started_at": 1573487005,
    "policy_name": "stackdriver-notifs-policy",
    "condition_name": "num-undelivered-messages",
    "url": "https://app.google.stackdriver.com/incidents/0.lekp2pr4h14z?project=cloudylabs",
    "state": "open",
    "ended_at": null,
    "summary": "Unacked messages for Cloud Pub/Sub Subscription labels 'subscription_id=pubsub-to-bigquery-pump-sub' is above the threshold of 100 with a value of 262.000."
  },
  "version": "1.2"
}
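As a rough illustration of how a downstream subscriber might pick out the fields it cares about, the sketch below reduces a notification like the one above to a single tab-separated line. It assumes jq is installed (the setup described here does not require it), and summarize_incident is a hypothetical helper name, not part of this service.

```shell
# Hypothetical helper (not part of this repo): read one notification JSON
# document on stdin and print "state<TAB>policy_name<TAB>condition_name".
# Assumes jq is available on the PATH. Fields that are absent in other
# notification types are printed as "null".
summarize_incident() {
  jq -r '.incident | "\(.state)\t\(.policy_name)\t\(.condition_name)"'
}
```

With the sample above saved to a file (say incident.json, an assumed name), `summarize_incident < incident.json` would print the incident state, policy name, and condition name on one line.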

Prerequisites

If you don't have one already, start by creating a new project and configuring the Google Cloud SDK. Similarly, if you have not done so already, you will have to set up Cloud Run.

Deployment

Configuration

To simplify the following commands, we will first capture the project ID and generate a notification token:

export PROJECT=$(gcloud config get-value project)
echo "Project: ${PROJECT}"
export NOTIF_TOKEN=$(openssl rand -base64 32)
echo "Token: ${NOTIF_TOKEN}"
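The token is later passed to the service as a URL query parameter (?token=...). Base64 output can contain +, / and = characters, which some URL parsers treat specially, so a hex-only token may be easier to work with. The following is one possible alternative, not the author's method; gen_token is a hypothetical helper, and the sketch assumes the service compares the token as an opaque string.

```shell
# Hypothetical helper: print 32 random bytes as 64 lowercase hex characters,
# which are safe to place in a query string without percent-encoding.
gen_token() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback for systems without openssl: read from /dev/urandom.
    od -An -N32 -tx1 /dev/urandom | tr -d ' \n'
    echo
  fi
}

export NOTIF_TOKEN="$(gen_token)"
echo "Token: ${NOTIF_TOKEN}"
```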

PubSub Topic

Create the topic (stackdriver-notifications) where all notifications will be published:

gcloud pubsub topics create stackdriver-notifications

IAM Account

Create an IAM service account (sd-notif-handler) which will be used to run the Cloud Run service:

gcloud iam service-accounts create sd-notif-handler \
  --display-name "stackdriver-notification cloud run service account"

To allow this account to perform the necessary functions, we are going to grant it a few roles:

gcloud projects add-iam-policy-binding $PROJECT \
  --member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
  --role roles/run.invoker

# TODO: `pubsub.publisher` should be sufficient
gcloud projects add-iam-policy-binding $PROJECT \
  --member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
  --role roles/pubsub.editor

gcloud projects add-iam-policy-binding $PROJECT \
  --member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
  --role roles/logging.logWriter

gcloud projects add-iam-policy-binding $PROJECT \
  --member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
  --role roles/cloudtrace.agent

gcloud projects add-iam-policy-binding $PROJECT \
  --member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
  --role roles/monitoring.metricWriter
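The five bindings above differ only in the role name, so they can also be expressed as a loop. This is a minimal sketch; grant_roles is a hypothetical helper name, and it simply repeats the same gcloud call shown above for each role it is given.

```shell
# Hypothetical helper: grant_roles PROJECT SERVICE_ACCOUNT_EMAIL ROLE...
# Runs one "add-iam-policy-binding" call per role and stops at the first failure.
grant_roles() {
  project=$1
  member="serviceAccount:$2"
  shift 2
  for role in "$@"; do
    gcloud projects add-iam-policy-binding "$project" \
      --member "$member" \
      --role "$role" || return 1
  done
}

# Usage (same roles as the individual commands above):
#   grant_roles "$PROJECT" "sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
#     roles/run.invoker roles/pubsub.editor roles/logging.logWriter \
#     roles/cloudtrace.agent roles/monitoring.metricWriter
```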

Cloud Run Service

Create the Cloud Run service that will handle all Stackdriver notifications:

gcloud beta run deploy sd-notif-handler \
  --allow-unauthenticated \
  --image gcr.io/cloudylabs-public/sd-notif-handler:0.1.1 \
  --platform managed \
  --timeout 15m \
  --region us-central1 \
  --set-env-vars "RELEASE=v0.1.1,TOPIC_NAME=stackdriver-notifications,NOTIF_TOKEN=${NOTIF_TOKEN}" \
  --service-account "sd-notif-handler@${PROJECT}.iam.gserviceaccount.com"

Once the service is created, we are also going to add a policy binding that grants the service account the run.invoker role on the service:

gcloud beta run services add-iam-policy-binding sd-notif-handler \
  --member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
  --role roles/run.invoker

Stackdriver

With the processing service ready, we can now define the Stackdriver channel and one or more policies.

Channel

Stackdriver supports WebHooks to notify remote services about incidents that occur. To set this up, first create a WebHook channel:

export SERVICE_URL=$(gcloud beta run services describe sd-notif-handler \
  --region us-central1 --format="value(status.url)")
echo "SERVICE_URL=${SERVICE_URL}"

gcloud alpha monitoring channels create \
  --display-name sd-notif-handler-channel \
  --channel-labels "url=${SERVICE_URL}/v1/notif?token=${NOTIF_TOKEN}" \
  --type webhook_tokenauth \
  --enabled
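Before wiring up real policies, it can be handy to post a sample notification to the same URL the channel uses. The sketch below assumes curl is installed and that the handler accepts a JSON POST on /v1/notif, which is the path used in the channel label above; which status codes the service returns is not documented here, so the helper just prints whatever comes back. send_test_notification is a hypothetical name.

```shell
# Hypothetical helper: send_test_notification SERVICE_URL TOKEN PAYLOAD_FILE
# POSTs the JSON file to the handler and prints the HTTP status code.
send_test_notification() {
  curl --silent --output /dev/null --write-out '%{http_code}\n' \
    --request POST \
    --header 'Content-Type: application/json' \
    --data-binary "@$3" \
    "$1/v1/notif?token=$2"
}

# Usage (incident.json is an assumed file holding a sample notification):
#   send_test_notification "$SERVICE_URL" "$NOTIF_TOKEN" incident.json
```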

Policy

To monitor GCP and even AWS resources, you need to create alerting policies that, when triggered, will use the channel (WebHook) created above to send notifications to Cloud Run. In the policy/ directory you will find a few sample policies. Here is an example of a policy monitoring a PubSub subscription for unacknowledged messages (age or number):

---
combiner: OR
conditions:
- conditionThreshold:
    aggregations:
    - alignmentPeriod: 60s
      perSeriesAligner: ALIGN_MEAN
    comparison: COMPARISON_GT
    duration: 60s
    filter: metric.type="pubsub.googleapis.com/subscription/oldest_unacked_message_age"
      resource.type="pubsub_subscription" resource.label."subscription_id"="my-iot-events-pump"
    thresholdValue: 180
    trigger:
      count: 1
  displayName: oldest-unacked-message-age
- conditionThreshold:
    aggregations:
    - alignmentPeriod: 60s
      perSeriesAligner: ALIGN_MEAN
    comparison: COMPARISON_GT
    duration: 60s
    filter: metric.type="pubsub.googleapis.com/subscription/num_undelivered_messages"
      resource.type="pubsub_subscription" resource.label."subscription_id"="my-iot-events-pump"
    thresholdValue: 100
    trigger:
      count: 1
  displayName: num-undelivered-messages
enabled: true

Once you have your policy file defined, you can create a policy and assign it to the channel created above:

export CHANNEL_ID=$(gcloud alpha monitoring channels list \
  --filter "displayName='sd-notif-handler-channel'" \
  --format 'value("name")')

gcloud alpha monitoring policies create \
  --display-name sd-notif-handler-policy \
  --notification-channels $CHANNEL_ID \
  --policy-from-file PATH_TO_YOUR_POLICY_FILE.yaml
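If the channel lookup returns nothing (for example, because the display name does not match), the policy command would run with an empty channel argument. A small guard can catch that early. This is only a sketch; require_value is a hypothetical helper name.

```shell
# Hypothetical helper: require_value NAME VALUE
# Returns non-zero and prints a message on stderr when VALUE is empty.
require_value() {
  if [ -z "$2" ]; then
    echo "error: $1 is empty" >&2
    return 1
  fi
}

# Usage:
#   require_value CHANNEL_ID "$CHANNEL_ID" && \
#     gcloud alpha monitoring policies create \
#       --display-name sd-notif-handler-policy \
#       --notification-channels "$CHANNEL_ID" \
#       --policy-from-file PATH_TO_YOUR_POLICY_FILE.yaml
```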

Cleanup

To clean up all resources created by this sample, execute:

export POLICY_ID=$(gcloud alpha monitoring policies list \
  --filter "displayName='sd-notif-handler-policy'" \
  --format 'value("name")')
gcloud alpha monitoring policies delete $POLICY_ID

export CHANNEL_ID=$(gcloud alpha monitoring channels list \
  --filter "displayName='sd-notif-handler-channel'" \
  --format 'value("name")')
gcloud alpha monitoring channels delete $CHANNEL_ID

gcloud pubsub topics delete stackdriver-notifications

gcloud beta run services delete sd-notif-handler \
  --platform managed \
  --region us-central1

gcloud iam service-accounts delete \
  "sd-notif-handler@${PROJECT}.iam.gserviceaccount.com"

Disclaimer

This is my personal project and it does not represent my employer. I take no responsibility for issues caused by this code. I do my best to ensure that everything works, but if something goes wrong, my apologies are all you will get.
