End to end data pipeline
☆22Apr 13, 2025Updated last year
Alternatives and similar repositories for datasystem
Users that are interested in datasystem are comparing it to the libraries listed below.
- Building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.☆28May 3, 2025Updated last year
- Visits sessionization pipeline used for the talk☆13May 28, 2024Updated 2 years ago
- This repo contains examples of high throughput ingestion using Apache Spark and Apache Iceberg. These examples cover IoT and CDC scenario…☆28Mar 17, 2026Updated 2 months ago
- ☆22May 20, 2024Updated 2 years ago
- ☆17Apr 1, 2025Updated last year
- Generate and Compare Debezium CDC (Change Data Capture) Avro Schema, directly from your Database.☆27Updated this week
- A guide to creating your own domain specific AI powered Knowledge Base.☆24Nov 6, 2023Updated 2 years ago
- This repo provides the Kubernetes Helm chart for deploying Pyspark Notebook.☆17Nov 16, 2022Updated 3 years ago
- A lightweight MVT (Mapbox Vector Tile) tileserver for DuckDB with the duckdb-spatial extension☆75Dec 10, 2025Updated 6 months ago
- Road corrections and measurements from ALS data☆24Nov 15, 2024Updated last year
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/☆13May 24, 2024Updated 2 years ago
- ☆21Apr 2, 2025Updated last year
- ☆12Dec 17, 2024Updated last year
- ☆17Apr 2, 2024Updated 2 years ago
- Build a Full stack Q&A Chatbot with Langchain, and LLM Models on Amazon Sagemaker☆12Nov 10, 2023Updated 2 years ago
- Sample code to collect Apache Iceberg metrics for table monitoring☆29Aug 18, 2024Updated last year
- ☆32Jan 30, 2026Updated 4 months ago
- This repository will consist of advanced RAG applications.☆35Jul 31, 2024Updated last year
- The project XOS or Experimental Operating System is a platform to help in developing a toy operating system.☆66Apr 11, 2018Updated 8 years ago
- Vegetation trait retrieval with remote sensing data in Google Earth Engine and openEO☆36Apr 22, 2026Updated last month
- ☆15Oct 10, 2025Updated 8 months ago
- ☆193May 21, 2025Updated last year
- PySpark boilerplate for running a prod-ready data pipeline☆29Mar 17, 2021Updated 5 years ago
- Python package for running hydrological models☆39Feb 19, 2026Updated 3 months ago
- Command-Line Interface (CLI) application for efficient and scalable generation of large-scale 3D building models.☆40Mar 9, 2025Updated last year
- The Cloud Squad is a web application that helps users assess their AWS certification exam readiness through a 30-minute test with real-ti…☆14Mar 24, 2025Updated last year
- Personal notes for the Azure Data Science exam DP-100☆19Mar 20, 2020Updated 6 years ago
- This repository is a collection of configuration files and settings that will be used to customize and set up the Neovim text editor☆17Aug 27, 2025Updated 9 months ago
- An Obsidian plugin for multidimensional note navigation☆28Apr 29, 2024Updated 2 years ago
- NLP chatbot project utilizing the entire SEP encyclopedia as RAG☆28Jan 7, 2026Updated 5 months ago
- The official repository for The Scrappy Project's DS and AI course☆39Aug 22, 2024Updated last year
- Analytics engineering with dbt - projects and developer environment☆22Sep 27, 2024Updated last year
- AWS Big Data Certification☆25Mar 26, 2026Updated 2 months ago
- Project makes use of LangChain and FastAPI - Focus and Async integration with Vectorstore☆58Feb 7, 2024Updated 2 years ago
- Generate 3D Models of Urban Areas.☆55Jul 6, 2021Updated 4 years ago
- ☆22Nov 25, 2024Updated last year
- ☆39Apr 3, 2025Updated last year
- A Python package to submit and manage Apache Spark applications on Kubernetes.☆46Feb 27, 2026Updated 3 months ago
- There are many like it, but this one is mine.☆41Updated this week