Code for Apache Hudi, Apache Iceberg and Delta Lake analysis
☆10Feb 2, 2024Updated 2 years ago
Alternatives and similar repositories for acid-file-formats
Users who are interested in acid-file-formats are comparing it to the libraries listed below.
- Don't Panic. This guide will help you when it feels like the end of the world.☆30Feb 7, 2026Updated last month
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta Lake, Postgres + Debezium CDC, MySQL,…☆28May 19, 2025Updated 9 months ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to…☆29Dec 9, 2024Updated last year
- ☆14Jan 8, 2026Updated 2 months ago
- Beyond Vibe Coding. Code, Planning, Documentation and Product Management agents.☆70Feb 20, 2026Updated 2 weeks ago
- Twitter Bot using a simplified Markov chain implementation☆10Jun 4, 2015Updated 10 years ago
- Repository for the dbt Semantic Layer course☆12Updated this week
- AutoMapper website☆14Aug 19, 2020Updated 5 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆30Updated this week
- Python Package to Share/Edit Pandas/Polars DF with web interface!☆11Jun 10, 2025Updated 8 months ago
- Quetty the cutest queue manager <3☆19Updated this week
- The official repository for the Rock the JVM Spark Optimization 2 course☆43Dec 4, 2023Updated 2 years ago
- Code snippets used in demos recorded for the blog.☆38Updated this week
- This project showcases how to integrate the world of DevOps, focusing on Continuous Integration (CI) and Continuous Deployment (CD) with …☆15Dec 27, 2023Updated 2 years ago
- A cross-platform desktop application that records audio and transcribes it to text using OpenAI's Whisper API or compatible services. Pe…☆25Dec 29, 2025Updated 2 months ago
- A collection of useful Azure CosmosDb SDK v3 extensions and utilities, developed as part of Allegro Pay product.☆13Jan 7, 2026Updated 2 months ago
- This solution helps you deploy ETL processes and data storage resources to create an Insurance Lake using Amazon S3 buckets for storage, …☆17Feb 5, 2026Updated last month
- This project sets up a real-time data pipeline utilizing Change Data Capture (CDC) to stream changes from a PostgreSQL database to a Clic…☆12May 9, 2024Updated last year
- Repo for Medium post 'Combine Blazor WebAssembly Client and Server Logs: Two-way log streaming with NLog and SignalR'☆10Apr 13, 2022Updated 3 years ago
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/☆13May 24, 2024Updated last year
- DBT and clickhouse test project with dagster☆12Aug 29, 2023Updated 2 years ago
- ☆11Nov 26, 2024Updated last year
- How to customize Tableau authentication using AWS Athena's JDBC Credentials Provider capabilities.☆14Jun 8, 2020Updated 5 years ago
- ☆13Jan 13, 2022Updated 4 years ago
- A simple thin thread demo to showcase how actors work with forkjoinpool☆10Mar 2, 2018Updated 8 years ago
- Helper for handling PySpark DataFrame partition size 📑🎛️☆12Mar 8, 2024Updated 2 years ago
- A Ruby/Sinatra-based Eye-Fi server☆40Jun 5, 2010Updated 15 years ago
- dbt-databend adapter plugin☆10May 30, 2024Updated last year
- similarity between graph nodes based on local information with PySpark☆10Sep 30, 2022Updated 3 years ago
- An experimental edge key-value database built on top of FoundationDB.☆11Jan 9, 2025Updated last year
- 📓 The Documentation website for the "pH7 Social Dating Builder" Software.☆10May 13, 2023Updated 2 years ago
- A CLI tool for deploying services in AWS Elastic Container Service☆12Oct 12, 2017Updated 8 years ago
- Benchmarking for distributed logs.☆10Dec 1, 2016Updated 9 years ago
- ☆12Jan 8, 2026Updated 2 months ago
- Java Based BitTorrent client☆17Jan 6, 2012Updated 14 years ago
- Emmett WebAPI starter repository with PostgreSQL☆14Updated this week
- Associated blog post - https://tristanrhodes.com/blog/Adventures-in-Algorithmic-Trading-on-the-Runescape-Grand-Exchange☆10Oct 14, 2024Updated last year
- ☆14Jan 25, 2026Updated last month
- Implemented Snapshot Algorithm (Chandy Lamport) widely used in Distributed Systems☆10Oct 10, 2017Updated 8 years ago