In this project we look to set up Airflow monitoring using ElasticSearch-LogStash-Kibana (ELK stack). We will set up the stack using docker images.

slatawa/Airflow-Monitoring-ElasticSearch-LogStash-Kibana

Airflow-Monitoring-ElasticSearch-LogStash-Kibana

Project Description:

In this project we set up Airflow monitoring using the ELK stack, running the stack from Docker images. Airflow cannot write logs directly to ElasticSearch, but it can read them from ElasticSearch. Airflow therefore writes its logs to local files in JSON format, and Filebeat, installed on the worker node, ships them to Logstash, which transforms the data and sends it to ElasticSearch (ES). We configure Kibana to connect to the ElasticSearch instance and build dashboards for monitoring our Airflow instance.

Architecture:

img_1.png
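
To make the flow concrete, here is a minimal Python sketch of what the transform step might do to one JSON log line before it reaches ES. The field names, the sample values, and the log_id template are assumptions for illustration (modelled on Airflow's `{dag_id}-{task_id}-{execution_date}-{try_number}` style of identifier); the actual Logstash pipeline in this repo may differ.

```python
import json

# Assumed template; the real value is whatever airflow.cfg defines.
LOG_ID_TEMPLATE = "{dag_id}-{task_id}-{execution_date}-{try_number}"


def enrich(line):
    """Parse one JSON log line and attach a log_id, as a Logstash filter might."""
    record = json.loads(line)
    # str.format ignores keys the template does not use.
    record["log_id"] = LOG_ID_TEMPLATE.format(**record)
    return record


# Hypothetical log line of the kind Airflow writes locally in JSON mode.
sample = json.dumps({
    "dag_id": "example_dag",
    "task_id": "task_1",
    "execution_date": "2020_01_01T00_00_00_000000",
    "try_number": "1",
    "levelname": "INFO",
    "message": "Task started",
})

print(enrich(sample)["log_id"])
```

The point of the extra field is that Airflow needs a stable key to look up the log lines of a single task try when it reads them back from ES.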

Setting this project locally

Build Docker Image

Open a command prompt, move to the directory where you cloned this repo, and bring up the Docker containers using the command below

img.png

After running the above command you can see the Docker containers up and running. If you don't have Docker Dashboard installed, you can also run docker ps to see the running containers.

img.png

The config changes required to send logs to ES via the architecture described above are in the ./mnt/airflow/airflow.cfg file
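
As a rough illustration of the kind of settings involved, the Python sketch below parses a hypothetical excerpt of an airflow.cfg and reports the logging options relevant to ES. The section and option names follow Airflow 1.10-era conventions and are assumptions; check the actual ./mnt/airflow/airflow.cfg for the values this project uses.

```python
import configparser

# Hypothetical excerpt, NOT a copy of this repo's airflow.cfg.
SAMPLE_CFG = """
[core]
remote_logging = True

[elasticsearch]
host = http://elasticsearch:9200
write_stdout = False
json_format = True
"""


def es_logging_settings(cfg_text):
    """Return the ES-related logging options found in an airflow.cfg string."""
    # Interpolation is disabled so literal '%' characters do not raise errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return {
        "remote_logging": parser.getboolean("core", "remote_logging"),
        "host": parser.get("elasticsearch", "host"),
        "write_stdout": parser.getboolean("elasticsearch", "write_stdout"),
        "json_format": parser.getboolean("elasticsearch", "json_format"),
    }


print(es_logging_settings(SAMPLE_CFG))
```

In this sketch, write_stdout is off and json_format is on, which matches the described setup: JSON log lines written to local files for Filebeat to pick up.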

Once the containers are running, you should be able to access the following links. If you have any issues accessing them, check your firewall exception settings.

ElasticSearch

http://localhost:9200/

img.png

Kibana

http://localhost:5601/

img.png

Airflow

http://localhost:8080/

img.png
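
If you prefer to check the three endpoints from a script rather than a browser, a small Python sketch like the following could be used. It treats any HTTP response as "reachable"; the helper names are illustrative, and the URLs are the ones listed above.

```python
import http.client
import urllib.error
import urllib.request

SERVICES = {
    "ElasticSearch": "http://localhost:9200/",
    "Kibana": "http://localhost:5601/",
    "Airflow": "http://localhost:8080/",
}


def fetch_status(url, timeout=3.0):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success status.
        return exc.code
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, malformed response, etc.
        return None


def check_services(services, fetch=fetch_status):
    """Map each service name to True if its URL returned any HTTP response."""
    return {name: fetch(url) is not None for name, url in services.items()}


if __name__ == "__main__":
    for name, reachable in check_services(SERVICES).items():
        print(f"{name}: {'reachable' if reachable else 'NOT reachable'}")
```

The fetch parameter exists only so the check can be exercised without a running stack, for example by passing a stub function.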

Install Filebeat

Open a terminal on the Airflow worker and run the command below to download Filebeat

curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.5.2-linux-x86_64.tar.gz

img_1.png

Untar the file

tar xvzf filebeat-7.5.2-linux-x86_64.tar.gz

Go into the newly created folder filebeat-7.5.2-linux-x86_64 and copy the config file from this repo at /misc/filebeat.yml into it. The config file has the settings to watch for new log files created by Airflow and send them to the Logstash server.

Run the command below to remove group and other write permissions from filebeat.yml

chmod go-w ./filebeat.yml
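
The permission change matters because Filebeat checks its config file's permissions at startup and will not load a file that group or other users can write to. The Python sketch below shows the same check and the equivalent of chmod go-w; the function names are illustrative, and the demo uses a temporary file rather than a real filebeat.yml.

```python
import os
import stat
import tempfile

# Write bits for "group" and "other", i.e. what `chmod go-w` clears.
GROUP_OTHER_WRITE = stat.S_IWGRP | stat.S_IWOTH


def is_group_or_other_writable(path):
    """True if group or other users have write permission on path."""
    return bool(os.stat(path).st_mode & GROUP_OTHER_WRITE)


def remove_group_other_write(path):
    """Equivalent of `chmod go-w path`: clear only the g+w and o+w bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~GROUP_OTHER_WRITE)


if __name__ == "__main__":
    # Demonstrate on a throwaway file so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        demo_path = tmp.name
    os.chmod(demo_path, 0o666)
    print("before:", is_group_or_other_writable(demo_path))
    remove_group_other_write(demo_path)
    print("after:", is_group_or_other_writable(demo_path))
    os.remove(demo_path)
```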

Start Filebeat

./filebeat

img.png

Creating Dashboards/Reports on Kibana

Before starting this section, run the two dummy DAGs from the Airflow UI a few times so that we have some logs to work with in Kibana.

img.png

Go to Management from the left-hand navigation pane, as shown in the screenshot below

img.png

From there select Index Patterns -> Create index pattern

img.png

Create a new index pattern on your logs as shown in the screenshots below

img.png

img.png

Once it is created, you can see the mappings in the index

img.png

Go to the Discover page from the left-hand side to see the log event details. This also confirms that our logs are flowing into ES, since the fresh logs are visible in Kibana, which is connected to the ES instance.

img.png
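
The same confirmation can be done outside Kibana by asking ES which indices exist and how many documents each holds. The Python sketch below uses the `_cat/indices?format=json` endpoint; the index names in the offline sample are hypothetical, and the base URL assumes the local setup described earlier.

```python
import json
import urllib.request


def docs_per_index(cat_indices_json):
    """Turn the JSON body of _cat/indices into {index name: document count}."""
    rows = json.loads(cat_indices_json)
    # docs.count arrives as a string and may be null (e.g. for closed indices).
    return {row["index"]: int(row.get("docs.count") or 0) for row in rows}


def fetch_cat_indices(base_url="http://localhost:9200", timeout=5.0):
    """Fetch the raw _cat/indices listing from a running ES instance."""
    url = f"{base_url}/_cat/indices?format=json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Offline sample so the parsing can be tried without a running stack.
sample = json.dumps([
    {"index": "airflow-logs-2020.01.01", "docs.count": "42"},
    {"index": ".kibana_1", "docs.count": "7"},
])

print(docs_per_index(sample))
```

Against the live stack, `docs_per_index(fetch_cat_indices())` should show a growing count on the log index after each DAG run.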

Now that we have the index set up, let's create a new dashboard that gives us some details on task health

img.png

img.png

Create a couple of visualizations for task health (tracking the number of failed tasks)

img.png

Final Dashboard

Shows the health of tasks by monitoring the number of failed task runs

img.png
