koziolk/openfda-bigdata-pipeline

OpenFDA BigData Pipeline

OpenFDA BigData Pipeline enables the collection, processing, and real-time presentation of data on adverse drug events from the openFDA database.

The solution uses Apache Kafka as the message broker, MongoDB as the document store, and Spring Boot for the services. All components are Dockerized.
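To give a feel for the collection step, here is a minimal Python sketch that builds a query URL for the public openFDA drug adverse event endpoint. It assumes the data is read from that endpoint and filtered on the `receivedate` field; how the openfda-producer service actually queries openFDA is not shown here. The function name `build_event_query` and its parameters are hypothetical.

```python
from datetime import date
from urllib.parse import urlencode

# Public openFDA endpoint for drug adverse event reports.
OPENFDA_DRUG_EVENT_URL = "https://api.fda.gov/drug/event.json"


def build_event_query(start: date, end: date, limit: int = 100, skip: int = 0) -> str:
    """Build a query URL for reports received between start and end.

    Assumption: filtering on `receivedate` (formatted as YYYYMMDD);
    the real producer may use a different field or paging scheme.
    """
    search = f"receivedate:[{start:%Y%m%d} TO {end:%Y%m%d}]"
    params = {"search": search, "limit": limit, "skip": skip}
    # urlencode percent-encodes the brackets and colon and turns spaces into "+".
    return f"{OPENFDA_DRUG_EVENT_URL}?{urlencode(params)}"
```

Calling `build_event_query(date(2020, 1, 1), date(2022, 1, 1))` yields a URL whose JSON response could then be published to Kafka record by record.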

Contents

This repository contains the code for the openFDA BigData Pipeline solution

Architecture

[Figure: Pipeline Architecture]

Configuration

The project runs with the default configuration defined in each of the services and in pipeline.yml. For more details, refer directly to those files.

Running the solution locally in Docker

If you want to try running the project yourself, I have put together a pipeline.yml configuration to help you get started.

Calling the following command

docker-compose -f pipeline.yml up

will:

  • Start openfda-producer container
  • Start zookeeper container
  • Start kafka container
  • Start mongodb container
  • Start openfda-consumer container
  • Start openfda-live-dashboard container, which exposes port 8050
  • Start jupyter-notebook container, which exposes port 8888

Accessing the application

Once all your Docker containers are up and running, you can access the openfda-live-dashboard web dashboard in a browser at the following URL:

http://localhost:8050

In addition, you can access Jupyter Notebook (the jupyter-notebook container) in a browser at the following URL:

http://localhost:8888
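The containers can take a while to start, so it may help to check that both ports accept connections before opening the browser. The following is a minimal sketch using only the Python standard library. The helper names `port_is_open` and `wait_for_port` are hypothetical and not part of this project; the ports are the defaults listed above.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll host:port until it accepts connections or the attempts run out."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_for_port("localhost", 8050)` returns True once the dashboard port accepts connections, and `wait_for_port("localhost", 8888)` does the same for Jupyter Notebook. An open port only shows that something is listening; it does not confirm that the application has finished loading.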

Example graphs

Top 20 patient reactions reported between 2020-01-01 and 2022-01-01

[Figure: Top patient reactions]

Top 20 patient medical products reported between 2020-01-01 and 2022-01-01

[Figure: Top medical products]
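Rankings like the two above can be sketched as a simple count over adverse event records. The example below is an illustration only, not the project's actual aggregation (which may well run inside MongoDB). It assumes records follow the openFDA drug event layout (`receivedate` as YYYYMMDD, `patient.reaction[].reactionmeddrapt`, `patient.drug[].medicinalproduct`) and treats the date range as inclusive. The function names and the sample records are made up for demonstration.

```python
from collections import Counter
from datetime import date, datetime


def received_on(event: dict) -> date:
    """Parse the `receivedate` field, formatted as YYYYMMDD."""
    return datetime.strptime(event["receivedate"], "%Y%m%d").date()


def reactions(event: dict) -> list:
    """Reaction terms reported for the patient in one event."""
    items = event.get("patient", {}).get("reaction", [])
    return [r["reactionmeddrapt"] for r in items if "reactionmeddrapt" in r]


def medical_products(event: dict) -> list:
    """Medicinal product names reported for the patient in one event."""
    items = event.get("patient", {}).get("drug", [])
    return [d["medicinalproduct"] for d in items if "medicinalproduct" in d]


def top_values(events, extract, start: date, end: date, n: int = 20):
    """Return the n most frequent extracted values for events in [start, end]."""
    counts = Counter()
    for event in events:
        if start <= received_on(event) <= end:
            counts.update(extract(event))
    return counts.most_common(n)


# Made-up sample records, for illustration only.
sample = [
    {"receivedate": "20200315",
     "patient": {"reaction": [{"reactionmeddrapt": "Nausea"},
                              {"reactionmeddrapt": "Headache"}],
                 "drug": [{"medicinalproduct": "ASPIRIN"}]}},
    {"receivedate": "20210620",
     "patient": {"reaction": [{"reactionmeddrapt": "Nausea"}],
                 "drug": [{"medicinalproduct": "ASPIRIN"},
                          {"medicinalproduct": "IBUPROFEN"}]}},
    {"receivedate": "20190101",  # outside the range, so it is ignored
     "patient": {"reaction": [{"reactionmeddrapt": "Rash"}],
                 "drug": [{"medicinalproduct": "IBUPROFEN"}]}},
]

start, end = date(2020, 1, 1), date(2022, 1, 1)
print(top_values(sample, reactions, start, end))
# [('Nausea', 2), ('Headache', 1)]
print(top_values(sample, medical_products, start, end))
# [('ASPIRIN', 2), ('IBUPROFEN', 1)]
```

Passing the extractor as a function keeps the counting logic in one place, so the same helper serves both rankings.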

Issues and contribution

Bug reports and pull requests are welcome on GitHub at https://github.com/koziolk/openfda-bigdata-pipeline
