From 3e960c4a55c952f6a66046b971c6494f299ec9b0 Mon Sep 17 00:00:00 2001
From: Anthony Bocci
Date: Sat, 23 Jun 2018 14:31:37 +0200
Subject: [PATCH 1/4] Create the component statuses documentation

The component-statuses.md file is based on the online version at:
https://docs.cachethq.io/docs/component-statuses
---
 docs/component-statuses.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
 create mode 100644 docs/component-statuses.md

diff --git a/docs/component-statuses.md b/docs/component-statuses.md
new file mode 100644
index 000000000000..9df7bcc7947b
--- /dev/null
+++ b/docs/component-statuses.md
@@ -0,0 +1,15 @@
+# Component Statuses
+
+Unlike incident statuses, component statuses in Cachet are numbered starting from 1.
+When creating or updating a component, you'll need to specify a status for it.
+
+A status can be one of the following:
+
+Status|Name|Description
+------|----|-----------
+1|Operational|The component is working.
+2|Performance issues|The component is experiencing some slowness.
+3|Partial Outage|The component may not be working for everybody. This could be a geographical issue, for example.
+4|Major outage|The component is not working for anybody.
+
+

From 6f6e17626ca00ebf85913b1d03199dcdb71c061f Mon Sep 17 00:00:00 2001
From: Anthony Bocci
Date: Sat, 23 Jun 2018 14:51:44 +0200
Subject: [PATCH 2/4] Create documentation on what is a metric

---
 docs/metrics/index.md | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 docs/metrics/index.md

diff --git a/docs/metrics/index.md b/docs/metrics/index.md
new file mode 100644
index 000000000000..247cb6c3e436
--- /dev/null
+++ b/docs/metrics/index.md
@@ -0,0 +1,32 @@
+# Metrics
+
+This guide explains the basics of metrics.
+
+## What are metrics
+
+When you monitor your services, servers, APIs, or other systems, you collect
+raw data. This data may be the response time of a request, the number of queries
+handled in a minute, etc.
+
+Metrics are this raw data.
+Using the [Cachet API][1], you can send the data about what you are monitoring to Cachet.
+
+
+## What metrics can do for you
+
+Having good metrics to show can be great for customers or partners.
+
+Do you have a big web service that is under pressure? Then a short response
+time is important. A metric can show your users that the web service is
+responding fast!
+Imagine you have a metric named "Response time". Every 10 seconds you call your
+web service and send the response time to that metric through the Cachet API. On
+your status page you'll then be able to see, for example, the average response
+time per minute.
+
+This way, your users could see that during the last 10 minutes your response
+time was worse than before, and that it is starting to improve.
+
+
+
+[1]: api-documentation.md

From 02ec0ad71a3dcc7de13948c1a0940d05c0526d32 Mon Sep 17 00:00:00 2001
From: Anthony Bocci
Date: Sat, 23 Jun 2018 15:23:39 +0200
Subject: [PATCH 3/4] Create documentation about metric creation.

---
 docs/metrics/create-metric.md | 54 +++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
 create mode 100644 docs/metrics/create-metric.md

diff --git a/docs/metrics/create-metric.md b/docs/metrics/create-metric.md
new file mode 100644
index 000000000000..712243021224
--- /dev/null
+++ b/docs/metrics/create-metric.md
@@ -0,0 +1,54 @@
+# Create a metric
+
+This guide walks you through creating a metric.
+You should first know [what a metric is][1].
+
+## Filling the form
+
+Creating a metric is as simple as filling in a form. You just need to know what
+the fields mean.
+
+To reach the metric creation form, follow these steps:
+
+- Log into your Cachet instance.
+- Once on the Dashboard, click `Metrics` in the sidebar.
+- Click the `Create a metric` button.
+
+And you are there! You should now see the metric form.
+Here is what each field means:
+
+- `Name`: The name of the metric as it will be shown on the status page.
+  Example: "API response time".
+- `Suffix`: The suffix that is added in the tooltip when you hover your mouse
+  over a point on the metric. Usually it's the unit of the raw data. Example:
+  "ms". If you send "42" to the metric, then "42ms" will be shown in the
+  tooltip.
+- `Description`: A description of the metric. What is the metric used for?
+  What does it measure? Example: "The average response time of our API".
+- `Calculation of metrics`: The computation applied to your data before it is
+  displayed in the metric. It may be either _Sum_ or _Average_. Example:
+  _Average_ to compute the average response time over a given period.
+- `Default view`: The default view of the metric. Viewing the data from 1 year
+  ago is not useful, but whether you see the data of the last hour, 12 hours,
+  week or month is up to your preference. Example: _Last 12 hours_ because you
+  want to see the last 12 hours of data by default. It's only the default view;
+  it can be changed in a select box.
+- `Decimal places`: The number of decimal places displayed for a point. If you
+  are computing the average of something, you will almost certainly get a value
+  with a fractional part, like 42.424242. Example: 2 to get 42.42 instead of a
+  long number.
+- `How many minutes of threshold between metric points?`: The number of minutes
+  between the points in the metric. According to your needs it may be 1, 5 or
+  even 30. It's really up to you. Example: 60 to get one point every hour.
+- `Display chart on the status page?`: If checked, this chart will be displayed
+  on the status page. It's also possible to create the metric without showing it.
+- `Visibility`: Who should be able to see the chart? You have three choices:
+  - `Visible to authenticated users`: Only authenticated users can see the
+    chart. Useful if it's an internal metric.
+  - `Visible to everybody`: Every user, even unauthenticated ones, can see
+    the chart.
+  - `Always hidden`: Nobody can see the chart.
+
+
+
+[1]: index.md

From 360f163a884e14837e65358eb1f7cfdbf71c07a7 Mon Sep 17 00:00:00 2001
From: Anthony Bocci
Date: Sat, 23 Jun 2018 15:42:36 +0200
Subject: [PATCH 4/4] Create documentation about incidents

Explain what an incident is and how to use it.
---
 docs/incidents/index.md | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)
 create mode 100644 docs/incidents/index.md

diff --git a/docs/incidents/index.md b/docs/incidents/index.md
new file mode 100644
index 000000000000..1e56a310def6
--- /dev/null
+++ b/docs/incidents/index.md
@@ -0,0 +1,41 @@
+# Incidents
+
+An incident is something that should not happen, but happens anyway.
+
+## What exactly is an incident
+
+On your status page you show the state of some components. A component may be a
+server, a database, or whatever you want.
+If your database server crashes, that is an incident.
+
+## Why should I create an incident
+
+Having a status page is a good thing; being honest about the state of your
+components is better.
+A status page is not only there to show a green light, it's also there to show
+why something bad is happening, and when it will be fixed.
+
+So, when a component experiences a problem, it's good practice to create an
+incident.
+
+## How to use incidents
+
+During an incident, it's good to keep the status page up to date with what
+happens in the real world. That's what _incident updates_ are for.
+
+How you manage your incidents is up to you, but if you have no idea, you can do
+the following:
+
+1. An incident happens. While a team is working to fix it, one person creates
+   the incident. Be clear about what is happening. At the same time, set the
+   affected component to the right status (_Major Outage_, _Performance issues_
+   or other).
+2. Once you identify the origin of the problem, add an _incident update_ to
+   explain what the problem is and how serious it is.
+3. If you think the problem is fixed but are not sure, add an _incident update_
+   to explain that. Say it should be fixed and that you are watching to check
+   that everything stays fine.
+4. If it's not fixed, add an _incident update_ as in the second step, because
+   the problem is identified but not fixed. If it's fixed, congratulations! Add
+   an _incident update_ to explain the details and say it's definitely fixed. Do
+   not forget to set the component to _Operational_ again.
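
The workflow suggested in docs/incidents/index.md can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name `incident_workflow` and the action labels (`create_incident`, `add_update`, `set_component_status`) are hypothetical and are not Cachet API calls. Only the component status codes come from the table in docs/component-statuses.md.

```python
from enum import IntEnum


class ComponentStatus(IntEnum):
    """Component status codes, as listed in docs/component-statuses.md."""
    OPERATIONAL = 1
    PERFORMANCE_ISSUES = 2
    PARTIAL_OUTAGE = 3
    MAJOR_OUTAGE = 4


def incident_workflow(component_id, status, fixed_first_try=True):
    """Return the suggested incident steps as an ordered list of plain dicts.

    The "action" and "note" values are hypothetical labels used for
    illustration only; they do not correspond to real Cachet endpoints.
    """
    identified = {"action": "add_update", "note": "identified"}
    watching = {"action": "add_update", "note": "watching"}

    steps = [
        # Step 1: create the incident and set the affected component's status.
        {"action": "create_incident", "component_id": component_id},
        {"action": "set_component_status",
         "component_id": component_id,
         "status": int(status)},
        # Step 2: the origin of the problem is identified.
        dict(identified),
        # Step 3: a fix is probably in place and is being watched.
        dict(watching),
    ]

    if not fixed_first_try:
        # Step 4, not fixed: go back to "identified", then watch again.
        steps += [dict(identified), dict(watching)]

    # Step 4, fixed: final update, then mark the component Operational again.
    steps += [
        {"action": "add_update", "note": "fixed"},
        {"action": "set_component_status",
         "component_id": component_id,
         "status": int(ComponentStatus.OPERATIONAL)},
    ]
    return steps


if __name__ == "__main__":
    for step in incident_workflow(7, ComponentStatus.MAJOR_OUTAGE):
        print(step)
```

Each dict would map to one manual action on the dashboard or one API request; the exact endpoints and field names should be taken from the Cachet API documentation rather than from this sketch.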