The Data Pipeline and Analytics Stack is a comprehensive solution designed for processing, storing, and visualizing data. Explore a complete data pipeline with all components set up and ready to use.
☆17Dec 26, 2023Updated 2 years ago
Alternatives and similar repositories for bigdata-ETL-pipeline
Users that are interested in bigdata-ETL-pipeline are comparing it to the libraries listed below.
- Building Data Lakehouse by open source technology. Support end to end data pipeline, from source data on AWS S3 to Lakehouse, visualize a…☆39Dec 15, 2025Updated 3 months ago
- This project aims to move the data from a Relational database system (RDBMS) to a Hadoop file system (HDFS)☆11Apr 29, 2022Updated 3 years ago
- A data pipeline moving data from a Relational database system (RDBMS) to a Hadoop file system (HDFS).☆15Jun 3, 2021Updated 4 years ago
- 📚🧪 Traffic Sentinel is a learning-focused POC that explores a scalable IoT architecture using Fog nodes and Apache Flink to process 📷 …☆28Dec 29, 2025Updated 3 months ago
- Data Pipeline that utilizes GCP, Python 3.10, Prefect, and more.☆10Jan 23, 2023Updated 3 years ago
- Files for the Docker and Kubernetes on Google Cloud Hands-On labs☆11Mar 14, 2023Updated 3 years ago
- VSCode extension for working with Architecture As A Code in the C4 model. Includes syntax highlighting, diagram preview, and tools for wo…☆35Updated this week
- Spark-based pipeline to extract and parse monthly games from the Lichess database.☆21Sep 22, 2025Updated 6 months ago
- Built a real-time streaming pipeline to extract stock data, using Apache Nifi, Debezium, Kafka, and Spark Streaming. Loaded the transform…☆28Oct 13, 2023Updated 2 years ago
- Codebase for EnterpriseOps-Gym from ServiceNow☆79Mar 25, 2026Updated 2 weeks ago
- End-to-End deployment of E-commerce customers segmentation using Clustering Machine learning algorithms in Google Cloud Platform and MLOp…☆20Jun 5, 2024Updated last year
- Training intrinsically motivated, independent Q-learners to play Tic-Tac-Toe☆11May 12, 2021Updated 4 years ago
- ⚡ FutureGPT - Application development framework that connects GPT-4 with external data, the internet, other applications and language mod…☆13May 14, 2023Updated 2 years ago
- A simple Perceptron in Python☆10Feb 11, 2022Updated 4 years ago
- Cloud based Data Platform based on Apache Spark☆27Feb 17, 2026Updated last month
- Spark and Hive docker containers sharing a common MySQL metastore☆26Apr 17, 2020Updated 5 years ago
- Natural Language Processing Project☆11Jul 6, 2021Updated 4 years ago
- Create flowcharts, sequence diagrams and more with mermaid js and AI☆17Jul 16, 2024Updated last year
- Doing sql in notebooks.☆15Aug 14, 2023Updated 2 years ago
- Companion repository that goes along with Snowflake's "Advanced Data Engineering with Snowflake" course☆30Apr 23, 2025Updated 11 months ago
- 🚀 Portfolio: Co-Pilot, 💡 Investing: Idea Generation, 🚦Trade: Due Diligence☆18Updated this week
- ☆25Jun 27, 2025Updated 9 months ago
- This project provides an AI-driven test case generator using FastAPI. The application accepts a GitHub repository name and generates test…☆20Jun 7, 2024Updated last year
- This is my Apache Airflow Local development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and …☆34Feb 9, 2024Updated 2 years ago
- Zero-dependency Java client for HashiCorp's Vault☆39Dec 14, 2025Updated 3 months ago
- ☆12Mar 17, 2022Updated 4 years ago
- Hyperaudio Converter - converts from JSON/SRT to HTML Based Interactive Transcript☆14Dec 16, 2020Updated 5 years ago
- Kafka and Spark Integration. All code in a Maven project.☆14Nov 16, 2022Updated 3 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…☆56May 6, 2023Updated 2 years ago
- Web app designed to enhance your interaction with OpenAI's language models☆12Jun 14, 2023Updated 2 years ago
- This repository serves as a comprehensive guide to effective data modeling and robust data quality assurance using popular open-source to…☆40Sep 20, 2023Updated 2 years ago
- ☆16May 27, 2025Updated 10 months ago
- Create a QnA bot on a pdf☆16May 27, 2023Updated 2 years ago
- A GitHub Action that analyses a Pull Request and adds unit tests if necessary / possible☆13Apr 29, 2023Updated 2 years ago
- Application of reinforcement learning in trading financial markets.☆20Feb 22, 2021Updated 5 years ago
- A text-to-speech (TTS) service based on Chatterbox-TTS. Provides an API compatible with OpenAI TTS and supports voice cloning, with a simple web user interface.☆19Jan 17, 2026Updated 2 months ago
- Introductory interactive Jupyter tutorial providing details about ORMs in order to assist in the teaching of their use to computing scien…☆14Oct 21, 2025Updated 5 months ago
- ☆13Jul 22, 2024Updated last year
- LLM Agent that performs sentiment analysis of drawings and natural language using a combination of Google Gemini Vision model and GPT-4 T…☆13Dec 22, 2023Updated 2 years ago