The Data Pipeline and Analytics Stack is a comprehensive solution designed for processing, storing, and visualizing data. Explore a complete data pipeline with all components set up and ready to use.
☆18Dec 26, 2023Updated 2 years ago
Alternatives and similar repositories for bigdata-ETL-pipeline
Users that are interested in bigdata-ETL-pipeline are comparing it to the libraries listed below.
- A data pipeline moving data from a Relational database system (RDBMS) to a Hadoop file system (HDFS).☆15Jun 3, 2021Updated 5 years ago
- 📚🧪 Traffic Sentinel is a learning-focused POC that explores a scalable IoT architecture using Fog nodes and Apache Flink to process 📷 …☆28Dec 29, 2025Updated 6 months ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke…☆21Aug 12, 2025Updated 10 months ago
- KnetBuilder data integration platform for building knowledge graphs. Previously known as ondex.☆16Apr 2, 2026Updated 2 months ago
- Spark, Airflow, Kafka☆24Apr 30, 2023Updated 3 years ago
- Spark-based pipeline to extract and parse monthly games from the Lichess database.☆22Sep 22, 2025Updated 9 months ago
- In this project I have built an ETL pipeline which scrapes the trending repositories by month, week, and day LIVE and extracts other related info…☆12Sep 9, 2023Updated 2 years ago
- Sample code and documentation for very basic things that I can't remember but want to aggregate in one place☆13Nov 7, 2021Updated 4 years ago
- VSCode extension for working with Architecture As A Code in the C4 model. Includes syntax highlighting, diagram preview, and tools for wo…☆38Jun 5, 2026Updated 3 weeks ago
- Jupyter notebooks for the teaching of mechanics☆11Oct 8, 2024Updated last year
- A simple Perceptron in Python☆10Feb 11, 2022Updated 4 years ago
- Spark and Hive docker containers sharing a common MySQL metastore☆26Apr 17, 2020Updated 6 years ago
- Crawling the data from lazada, websosanh, compare.vn, cdiscount and cungmua with flexible configs☆30Jul 7, 2016Updated 9 years ago
- Natural Language Processing Project☆11Jul 6, 2021Updated 4 years ago
- Skribify is a powerful transcription and summarization tool that leverages the power of OpenAI's GPT-4 and WhisperAI to generate concise …☆12Apr 29, 2025Updated last year
- A web application implemented in Rust that, modeled on Chat2DB, uses ChatGPT to translate natural language and then operate on a database.☆10Jan 13, 2025Updated last year
- Python implementation of binary max-heaps.☆11Mar 22, 2020Updated 6 years ago
- Doing sql in notebooks.☆15Aug 14, 2023Updated 2 years ago
- ☆25Jun 27, 2025Updated last year
- ☆10Jan 31, 2021Updated 5 years ago
- This project provides an AI-driven test case generator using FastAPI. The application accepts a GitHub repository name and generates test…☆20Jun 7, 2024Updated 2 years ago
- Extension that automatically fills in the Regulations + Law (Quy chế + Pháp Luật) form - HUST☆23May 26, 2026Updated last month
- Talk to your database as if you were chatting with a friend. Turn natural language into powerful SQL queries effortlessly, and get your a…☆10Nov 12, 2024Updated last year
- This is my Apache Airflow Local development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and …☆34Feb 9, 2024Updated 2 years ago
- Hyperaudio Converter - converts from JSON/SRT to HTML Based Interactive Transcript☆14Dec 16, 2020Updated 5 years ago
- Kafka and Spark Integration. All code is in a Maven project.☆14Nov 16, 2022Updated 3 years ago
- Web app designed to enhance your interaction with OpenAI's language models☆12Jun 14, 2023Updated 3 years ago
- Companion repository that goes along with Snowflake's "Advanced Data Engineering with Snowflake" course☆37Apr 23, 2025Updated last year
- Sidewall is a Python library for interacting with the Dimensions search API.☆17Sep 11, 2024Updated last year
- ☆16May 27, 2025Updated last year
- This repository serves as a comprehensive guide to effective data modeling and robust data quality assurance using popular open-source to…☆42Sep 20, 2023Updated 2 years ago
- An investment portfolio simulator☆12Oct 15, 2019Updated 6 years ago
- ☆51May 21, 2026Updated last month
- Create a QnA bot on a pdf☆16May 27, 2023Updated 3 years ago
- Community documents for Teiid Engine and Teiid Server☆14Jan 6, 2021Updated 5 years ago
- MCP Client and Server apps to demo integration of Azure OpenAI-based AI agent with a Data Warehouse, exposed through GraphQL in Microsoft…☆11Jul 6, 2025Updated 11 months ago
- A text-to-speech (TTS) service based on Chatterbox-TTS. Provides an API compatible with OpenAI TTS, supports voice cloning, and comes with a simple web user interface.☆19May 7, 2026Updated last month
- ☆20Feb 12, 2025Updated last year
- Introductory interactive Jupyter tutorial providing details about ORMs in order to assist in the teaching of their use to computing scien…☆14Oct 21, 2025Updated 8 months ago