The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Flink CDC is a streaming data integration tool
Category-wide association study (CWAS) (Werling et al., 2018; An et al., 2018)
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Feldera Continuous Analytics Platform
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack - Reverse ETL & Customer Data Platform (CDP)
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
3-month Data Science Bootcamp
📺 Instill Console for 🔮 Instill Core: https://github.com/instill-ai/instill-core
Compiler for streaming data pipelines and data microservices with configurable engines.
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
Cryptocurrency prediction using LSTM (Long Short Term Memory) [ Hugging Face: https://huggingface.co/spaces/qywok/cryptocurrency_prediction ]
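Before an LSTM can be trained on a price series, the series is usually reshaped into overlapping (input window, next value) pairs. A minimal sketch of that preparation step, using plain NumPy (the function name `make_windows` and the sample prices are illustrative, not from the linked repo):

```python
import numpy as np

def make_windows(prices, window=5):
    """Split a price series into overlapping (input window, next value)
    pairs -- the supervised format an LSTM is typically trained on."""
    X, y = [], []
    for i in range(len(prices) - window):
        X.append(prices[i:i + window])
        y.append(prices[i + window])
    return np.array(X), np.array(y)

prices = np.array([10.0, 10.5, 10.2, 10.8, 11.0, 11.3, 11.1, 11.6])
X, y = make_windows(prices, window=3)
print(X.shape, y.shape)  # (5, 3) (5,)
```

Each row of `X` would then be fed to the LSTM as one input sequence, with the matching entry of `y` as the prediction target.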
Jayvee is a domain-specific language and runtime for automated processing of data pipelines
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
Privacy- and security-focused Segment alternative, written in Golang and React
Toolkit for describing data transformation pipelines by compositing simple reusable components.
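The compositing idea above can be sketched in a few lines of Python: each reusable component is a small function over the data, and a `compose` helper chains them into one pipeline. The component names (`strip`, `drop_empty`, `upper`) are hypothetical examples, not part of any of the listed toolkits:

```python
from functools import reduce

def compose(*steps):
    """Chain simple single-responsibility steps into one pipeline function."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)

# Hypothetical reusable components, each doing one transformation
strip = lambda rows: [r.strip() for r in rows]
drop_empty = lambda rows: [r for r in rows if r]
upper = lambda rows: [r.upper() for r in rows]

pipeline = compose(strip, drop_empty, upper)
print(pipeline(["  etl ", "", " elt"]))  # ['ETL', 'ELT']
```

Because each step has the same shape (rows in, rows out), components can be reordered or reused across pipelines without modification.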
👷🌇 Set up and build a big data processing pipeline with Apache Spark, 📦 AWS services (S3, EMR, EC2, IAM, VPC, Redshift), Terraform to set up the infrastructure, and Apache Airflow to automate workflows 🥊
Creation of a Data Warehouse (DW) using dimensional modeling in a star schema.
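A star schema like the one described above centers a fact table on foreign keys into small dimension tables. A minimal sketch using Python's built-in `sqlite3` (the table names and sample rows are illustrative, not from the repo):

```python
import sqlite3

# Star schema sketch: one fact table referencing two dimension tables.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, day TEXT);
CREATE TABLE fact_sales  (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
INSERT INTO dim_date VALUES (1, '2024-01-01');
INSERT INTO fact_sales VALUES (1, 1, 9.99), (2, 1, 19.99), (1, 1, 9.99);
""")

# Typical analytical query: aggregate facts, label with a dimension attribute.
totals = con.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f JOIN dim_product p USING (product_id)
    GROUP BY p.name ORDER BY p.name
""").fetchall()
print(totals)
```

Keeping descriptive attributes in the dimensions and only keys plus measures in the fact table is what makes such aggregations cheap and the schema easy to extend.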