Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
Flink CDC is a streaming data integration tool for change data capture (CDC).
BitSail is a distributed, high-performance data integration engine that supports batch, streaming, and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of records every day.
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
Compiler for streaming data pipelines and data microservices with configurable engines.
Kafka Streams made easy with a YAML file
A cron replacement for scheduling complex data workflows.
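What distinguishes such schedulers from plain cron is dependency awareness: a task runs only after its upstream tasks finish. A minimal sketch of that idea, using Python's standard-library `graphlib` and a hypothetical five-task workflow (the task names are illustrative, not from any listed project):

```python
from graphlib import TopologicalSorter

# Hypothetical workflow DAG: each task maps to the set of tasks
# it depends on. A dependency-aware scheduler must order execution
# so every task runs only after all of its upstreams complete.
workflow = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "load": {"aggregate"},
    "report": {"load", "clean"},
}

def run_order(dag):
    """Return one valid execution order for the workflow DAG."""
    return list(TopologicalSorter(dag).static_order())

print(run_order(workflow))
```

`TopologicalSorter` also raises `CycleError` on circular dependencies, which is exactly the validation a workflow scheduler needs before accepting a new DAG.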
Data pipeline using Apache Kafka, Apache Spark and HDFS
Toolkit for describing data transformation pipelines by composing simple reusable components.
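The "reusable components" pattern usually means each stage is a small function from record to record, and a pipeline is just their composition. A minimal sketch under that assumption (the `parse` and `normalize` stages are hypothetical, not taken from the toolkit above):

```python
from functools import reduce

# Hypothetical stage 1: parse a "key=value" string into a record.
def parse(record: str) -> dict:
    key, _, value = record.partition("=")
    return {"key": key, "value": value}

# Hypothetical stage 2: normalize the key field.
def normalize(record: dict) -> dict:
    return {**record, "key": record["key"].strip().lower()}

def compose(*stages):
    """Compose stages left to right into a single pipeline function."""
    return lambda record: reduce(lambda acc, stage: stage(acc), stages, record)

pipeline = compose(parse, normalize)
print(pipeline(" User =42"))  # → {'key': 'user', 'value': '42'}
```

Because stages share one plain-function interface, they can be reordered, reused across pipelines, and unit-tested in isolation.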
An end-to-end data pipeline with Kafka and Spark Streaming integration.
LinkedIn's previous generation Kafka to HDFS pipeline.
⚡ Data Integration | DataLink is a lightweight data integration framework built on top of DataX, Spark, and Flink.
Data-processing and common libraries used in main project, all available under Apache 2.0
Real-time data streaming pipeline.
CS502Capstone
Efficiently captures real-time Wikimedia data, like a newsroom for Wikipedia changes. Uses microservices, Kafka, and Spring Boot for reliability and scalability. Ideal for research and analysis.
A Kafka-Elasticsearch pipeline for storing and analyzing server health logs.
Real-time metrics calculation pipeline using Kafka, Elasticsearch, and Kibana.
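The core of a metrics pipeline like this is the aggregation step between the Kafka source and the Elasticsearch sink: bucketing timestamped events into tumbling windows and reducing each bucket. A self-contained sketch of just that step (the event shape and 10-second window size are assumptions for illustration; Kafka consumption and the Elasticsearch sink are omitted):

```python
from collections import defaultdict

WINDOW_SECONDS = 10  # assumed tumbling-window size

def windowed_average(events):
    """Bucket (timestamp, latency_ms) events into tumbling windows
    and return the average latency per window start time."""
    buckets = defaultdict(list)
    for ts, latency_ms in events:
        buckets[ts - ts % WINDOW_SECONDS].append(latency_ms)
    return {start: sum(v) / len(v) for start, v in sorted(buckets.items())}

events = [(100, 20.0), (104, 40.0), (112, 10.0)]
print(windowed_average(events))  # → {100: 30.0, 110: 10.0}
```

In the real pipeline each window's result would be indexed into Elasticsearch as one document, which is what Kibana then visualizes.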