This simple Cloud Run service handles Stackdriver notifications triggered by alerting policies and publishes them to a Cloud PubSub topic for additional handlers to process downstream (see pubsub-to-bigquery-pump for a real-world example of how this service can be used).
A single Stackdriver notification channel (WebHook) targets a single Cloud Run service, which can handle notifications from one or more Stackdriver alerting policies. The service also validates the WebHook token to make sure each notification comes from a valid source, and then relays the message to a PubSub topic.
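The token check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the service's actual code: the function name `is_valid_token` is hypothetical, and it assumes the token arrives as a `token` query parameter (as in the WebHook URL configured later in this document).

```python
import hmac
from urllib.parse import parse_qs, urlparse


def is_valid_token(request_url: str, expected_token: str) -> bool:
    """Return True when the URL's `token` query parameter matches the expected value."""
    if not expected_token:
        # Never accept requests when no token has been configured
        return False
    query = parse_qs(urlparse(request_url).query)
    provided = query.get("token", [""])[0]
    # compare_digest avoids leaking, through timing, where the two values differ
    return hmac.compare_digest(provided.encode(), expected_token.encode())


print(is_valid_token("https://example.com/v1/notif?token=s3cret", "s3cret"))
print(is_valid_token("https://example.com/v1/notif?token=wrong", "s3cret"))
```

Requests that fail this check would typically be rejected before anything is published to the topic.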
- Unique event triggers - Stackdriver can be configured with multiple alerting policies to capture many GCP events that are currently not available through other means
- Works around channel limits - Stackdriver has a limit of 16 notification channels. This service allows you to create a single channel and route any number of alerting policies through this channel to PubSub, where you can use GCF or Cloud Run to process these events.
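To illustrate the downstream side of that routing, here is a rough Python sketch of a handler that receives a PubSub push message and dispatches on the policy name. It assumes the notification JSON is published unchanged as the message data and that a push subscription delivers it in the standard envelope (`message.data` is base64-encoded). The handler functions and the `HANDLERS` table are hypothetical names introduced for this example only.

```python
import base64
import json
from typing import Callable, Dict

Handler = Callable[[dict], str]


def on_backlog(incident: dict) -> str:
    return f"backlog alert: {incident.get('summary', '')}"


def on_unknown(incident: dict) -> str:
    return f"unhandled policy: {incident.get('policy_name', 'unknown')}"


# Hypothetical routing table keyed by alerting policy name
HANDLERS: Dict[str, Handler] = {"stackdriver-notifs-policy": on_backlog}


def route_push_message(envelope: dict) -> str:
    """Decode a PubSub push envelope and dispatch on the incident's policy name."""
    data = base64.b64decode(envelope["message"]["data"])
    incident = json.loads(data).get("incident", {})
    handler = HANDLERS.get(incident.get("policy_name"), on_unknown)
    return handler(incident)


# Build a sample envelope the way a push subscription would deliver it
sample = {"incident": {"policy_name": "stackdriver-notifs-policy",
                       "summary": "Unacked messages above threshold"}}
envelope = {"message": {"data": base64.b64encode(json.dumps(sample).encode()).decode()}}
print(route_push_message(envelope))
```

Keying the table on the policy name lets one topic serve many policies, which is the point of funnelling everything through a single channel.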
The notifications published to the PubSub topic differ in content depending on the type of policy that triggered them (see alert samples). Here is an example of an incident alert for a metered resource (e.g. a PubSub subscription):
{
  "incident": {
    "incident_id": "0.lekp2pr4h14z",
    "resource_id": "",
    "resource_name": "cloudylabs Cloud Pub/Sub Subscription labels {subscription_id=pubsub-to-bigquery-pump-sub}",
    "resource": {
      "type": "pubsub_subscription",
      "labels": { "subscription_id": "pubsub-to-bigquery-pump-sub" }
    },
    "started_at": 1573487005,
    "policy_name": "stackdriver-notifs-policy",
    "condition_name": "num-undelivered-messages",
    "url": "https://app.google.stackdriver.com/incidents/0.lekp2pr4h14z?project=cloudylabs",
    "state": "open",
    "ended_at": null,
    "summary": "Unacked messages for Cloud Pub/Sub Subscription labels 'subscription_id=pubsub-to-bigquery-pump-sub' is above the threshold of 100 with a value of 262.000."
  },
  "version": "1.2"
}
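A consumer of this payload might load it into a typed structure before acting on it. The sketch below is one possible approach in Python. The `Incident` class and `parse_incident` function are hypothetical, only a subset of the fields is mapped, and the code assumes `started_at` and `ended_at` are Unix timestamps in seconds, as the sample above suggests.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Incident:
    incident_id: str
    policy_name: str
    condition_name: str
    state: str
    started_at: datetime
    ended_at: Optional[datetime]
    resource_type: str

    @property
    def is_open(self) -> bool:
        return self.state == "open"


def _to_datetime(epoch_seconds: Optional[int]) -> Optional[datetime]:
    """Convert epoch seconds to an aware UTC datetime, passing None through."""
    if epoch_seconds is None:
        return None
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def parse_incident(payload: str) -> Incident:
    """Map the `incident` object of a notification onto an Incident."""
    body = json.loads(payload)["incident"]
    return Incident(
        incident_id=body["incident_id"],
        policy_name=body["policy_name"],
        condition_name=body["condition_name"],
        state=body["state"],
        started_at=_to_datetime(body["started_at"]),
        ended_at=_to_datetime(body.get("ended_at")),
        resource_type=body.get("resource", {}).get("type", ""),
    )


payload = json.dumps({"incident": {
    "incident_id": "0.lekp2pr4h14z", "policy_name": "stackdriver-notifs-policy",
    "condition_name": "num-undelivered-messages", "state": "open",
    "started_at": 1573487005, "ended_at": None,
    "resource": {"type": "pubsub_subscription"}}, "version": "1.2"})
incident = parse_incident(payload)
print(incident.policy_name, incident.is_open, incident.started_at.isoformat())
```

Since `ended_at` is null while an incident is open, it is kept optional rather than defaulted to a timestamp.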
If you don't have one already, start by creating a new project and configuring the Google Cloud SDK. Similarly, if you have not done so already, you will have to set up Cloud Run.
To simplify the following commands, we will first capture the project ID and notification token:
export PROJECT=$(gcloud config get-value project)
echo "Project: ${PROJECT}"
export NOTIF_TOKEN=$(openssl rand -base64 32)
echo "Token: ${NOTIF_TOKEN}"
Create the topic (stackdriver-notifications) where all notifications will be published:
gcloud pubsub topics create stackdriver-notifications
Create an IAM service account (sd-notif-handler) which will be used to run the Cloud Run service:
gcloud iam service-accounts create sd-notif-handler \
--display-name "stackdriver-notification cloud run service account"
To allow this account to perform the necessary functions, we are going to grant it a few roles:
gcloud projects add-iam-policy-binding $PROJECT \
--member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
--role roles/run.invoker
# TODO: `pubsub.publisher` should be sufficient
gcloud projects add-iam-policy-binding $PROJECT \
--member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
--role roles/pubsub.editor
gcloud projects add-iam-policy-binding $PROJECT \
--member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
--role roles/logging.logWriter
gcloud projects add-iam-policy-binding $PROJECT \
--member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
--role roles/cloudtrace.agent
gcloud projects add-iam-policy-binding $PROJECT \
--member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
--role roles/monitoring.metricWriter
Create the Cloud Run service that will handle all Stackdriver notifications:
gcloud beta run deploy sd-notif-handler \
--allow-unauthenticated \
--image gcr.io/cloudylabs-public/sd-notif-handler:0.1.1 \
--platform managed \
--timeout 15m \
--region us-central1 \
--set-env-vars "RELEASE=v0.1.1,TOPIC_NAME=stackdriver-notifications,NOTIF_TOKEN=${NOTIF_TOKEN}" \
--service-account "sd-notif-handler@${PROJECT}.iam.gserviceaccount.com"
Once the service is created, we are also going to add a policy binding that allows the service account to invoke it:
gcloud beta run services add-iam-policy-binding sd-notif-handler \
--member "serviceAccount:sd-notif-handler@${PROJECT}.iam.gserviceaccount.com" \
--role roles/run.invoker
With the processing service ready, we can now define the Stackdriver channel and one or more policies.
Stackdriver supports WebHooks to notify remote services about incidents as they occur. To set this up, first create a WebHook channel:
export SERVICE_URL=$(gcloud beta run services describe sd-notif-handler \
--region us-central1 --format="value(status.url)")
echo "SERVICE_URL=${SERVICE_URL}"
gcloud alpha monitoring channels create \
--display-name sd-notif-handler-channel \
--channel-labels "url=${SERVICE_URL}/v1/notif?token=${NOTIF_TOKEN}" \
--type webhook_tokenauth \
--enabled
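Because the token travels in the URL query string, its characters matter. Base64 output such as that produced by `openssl rand -base64 32` can contain `+`, `/` and `=`, and many query-string parsers decode `+` as a space. Whether this affects the service depends on how it (and Stackdriver) handle URL encoding, which this document does not specify, so treat the following Python sketch as an illustration of the general pitfall only. The URL and helper name are hypothetical.

```python
import base64
import secrets
from urllib.parse import parse_qs, quote, urlparse


def token_from_url(url: str) -> str:
    """Read the `token` query parameter the way a typical form decoder would."""
    return parse_qs(urlparse(url).query).get("token", [""])[0]


# A base64 value chosen so that it contains '+', '/' and '='
raw_token = base64.b64encode(b"\xfb\xef\xbe\xff\xff").decode()

# Unencoded: '+' is decoded as a space, so the token no longer round-trips
naive_url = f"https://example.com/v1/notif?token={raw_token}"
print(token_from_url(naive_url) == raw_token)

# Percent-encoded: the original token is recovered intact
encoded_url = f"https://example.com/v1/notif?token={quote(raw_token, safe='')}"
print(token_from_url(encoded_url) == raw_token)

# Alternative: generate a token from a URL-safe alphabet in the first place
safe_token = secrets.token_urlsafe(32)
safe_url = f"https://example.com/v1/notif?token={safe_token}"
print(token_from_url(safe_url) == safe_token)
```

If token validation fails intermittently, regenerating the token until it contains only URL-safe characters, or percent-encoding it in the channel URL, are two things worth trying.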
To monitor GCP and even AWS resources, you need to create alerting policies that, when triggered, will use the channel (WebHook) created above to send notifications to Cloud Run. In the policy/ directory you will find a few sample policies. Here is an example of a policy monitoring a PubSub subscription for unacknowledged messages (age or number):
---
combiner: OR
conditions:
- conditionThreshold:
    aggregations:
    - alignmentPeriod: 60s
      perSeriesAligner: ALIGN_MEAN
    comparison: COMPARISON_GT
    duration: 60s
    filter: metric.type="pubsub.googleapis.com/subscription/oldest_unacked_message_age"
      resource.type="pubsub_subscription" resource.label."subscription_id"="my-iot-events-pump"
    thresholdValue: 180
    trigger:
      count: 1
  displayName: oldest-unacked-message-age
- conditionThreshold:
    aggregations:
    - alignmentPeriod: 60s
      perSeriesAligner: ALIGN_MEAN
    comparison: COMPARISON_GT
    duration: 60s
    filter: metric.type="pubsub.googleapis.com/subscription/num_undelivered_messages"
      resource.type="pubsub_subscription" resource.label."subscription_id"="my-iot-events-pump"
    thresholdValue: 100
    trigger:
      count: 1
  displayName: num-undelivered-messages
enabled: true
Once created, that policy and its two conditions will be visible in Stackdriver.
Once you have your policy file defined, you can create a policy and assign it to the channel created above:
export CHANNEL_ID=$(gcloud alpha monitoring channels list \
--filter "displayName='sd-notif-handler-channel'" \
--format 'value("name")')
gcloud alpha monitoring policies create \
--display-name sd-notif-handler-policy \
--notification-channels $CHANNEL_ID \
--policy-from-file PATH_TO_YOUR_POLICY_FILE.yaml
To clean up all resources created by this sample, execute:
export POLICY_ID=$(gcloud alpha monitoring policies list \
--filter "displayName='sd-notif-handler-policy'" \
--format 'value("name")')
gcloud alpha monitoring policies delete $POLICY_ID
export CHANNEL_ID=$(gcloud alpha monitoring channels list \
--filter "displayName='sd-notif-handler-channel'" \
--format 'value("name")')
gcloud alpha monitoring channels delete $CHANNEL_ID
gcloud pubsub topics delete stackdriver-notifications
gcloud beta run services delete sd-notif-handler \
--platform managed \
--region us-central1
gcloud iam service-accounts delete \
"sd-notif-handler@${PROJECT}.iam.gserviceaccount.com"
This is my personal project and it does not represent my employer. I take no responsibility for issues caused by this code. I do my best to ensure that everything works, but if something goes wrong, my apologies are all you will get.