Delta Lake helper methods. No Spark dependency.
☆22 · Jan 19, 2026 · Updated last month
Alternatives and similar repositories for levi
Users who are interested in levi are comparing it to the libraries listed below.
- Pandas helper functions ☆31 · Feb 19, 2023 · Updated 3 years ago
- Speak Slack notifications and process Slack slash commands ☆15 · Dec 20, 2018 · Updated 7 years ago
- Delta Lake helper methods in PySpark ☆327 · Jan 19, 2026 · Updated last month
- A Python Library to support running data quality rules while the spark job is running⚡ ☆200 · Updated this week
- Streaming ETL job cases in AWS Glue to integrate Iceberg and creating an in-place updatable data lake on Amazon S3 ☆26 · Sep 10, 2024 · Updated last year
- Sample code to collect Apache Iceberg metrics for table monitoring ☆29 · Aug 18, 2024 · Updated last year
- pyspark framework ☆25 · Feb 22, 2022 · Updated 4 years ago
- Local AWS EMR - A local service that imitates AWS EMR ☆27 · Jul 5, 2023 · Updated 2 years ago
- PySpark test helper methods with beautiful error messages ☆753 · Updated this week
- ☆13 · Feb 5, 2026 · Updated 3 weeks ago
- ☆10 · Feb 10, 2026 · Updated 3 weeks ago
- Spark style guide ☆272 · Sep 30, 2024 · Updated last year
- The Emerging Solutions Toolbox is a collection of solutions created by Snowflake's Solution Innovation Team (SIT) that consists of demos,… ☆61 · Feb 17, 2026 · Updated 2 weeks ago
- Instant search for and access to many datasets in Pyspark. ☆34 · Oct 6, 2022 · Updated 3 years ago
- Beyond Vibe Coding. Code, Planning, Documentation and Product Management agents. ☆70 · Feb 20, 2026 · Updated last week
- A set of tools aimed to bridge Phoenix with the Kotlin Multiplatform world ☆10 · Feb 20, 2022 · Updated 4 years ago
- Repository for the dbt Semantic Layer course ☆11 · Nov 13, 2025 · Updated 3 months ago
- Data pipeline example written in Rust with Polars and DataFusion DataFrame package ☆41 · Mar 12, 2023 · Updated 2 years ago
- Twitter auto account report bot using selenium with python ☆12 · Apr 19, 2024 · Updated last year
- This project showcases how to integrate the world of DevOps, focusing on Continuous Integration (CI) and Continuous Deployment (CD) with … ☆15 · Dec 27, 2023 · Updated 2 years ago
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆13 · May 24, 2024 · Updated last year
- RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best … ☆10 · Nov 3, 2023 · Updated 2 years ago
- Service to evaluate quality measure and cohort specifications against a target patient data set. ☆11 · Jun 2, 2022 · Updated 3 years ago
- Lecture topics for the Israeli Node.js community monthly meetups ☆12 · May 30, 2018 · Updated 7 years ago
- PySpark schema generator ☆44 · Feb 23, 2023 · Updated 3 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆683 · Mar 6, 2025 · Updated 11 months ago
- A Python package to submit and manage Apache Spark applications on Kubernetes. ☆46 · Updated this week
- A fastmcp server for open budget project ☆13 · Jan 13, 2026 · Updated last month
- A Mixture‑of‑Experts Educational Framework for Adaptive Cybersecurity ☆20 · Feb 8, 2026 · Updated 3 weeks ago
- Data Pipeline that utilizes GCP, Python 3.10, Prefect, and more. ☆10 · Jan 23, 2023 · Updated 3 years ago
- Flake8 plugin to lint for backwards incompatible database migrations ☆12 · Updated this week
- Trying out the Dataframe Polars library with Delta Lake ... feat Python. ☆12 · Jan 29, 2025 · Updated last year
- Building a poor man's data lake: Exploring the Power of Polars and Delta Lake ☆11 · Dec 6, 2025 · Updated 2 months ago
- ☆10 · Jul 11, 2024 · Updated last year
- End-to-End ELT data pipeline with Postgres, Airbyte, dbt, Dagster, Snowflake and Metabase ☆11 · Jul 13, 2023 · Updated 2 years ago
- dbt-databend adapter plugin ☆10 · May 30, 2024 · Updated last year
- Apache Airflow Best Practices, published by Packt ☆51 · Nov 4, 2024 · Updated last year
- Loosely coupled observer pattern implementation ☆11 · May 23, 2022 · Updated 3 years ago
- An example of SparkConnect extension. ☆15 · Mar 5, 2024 · Updated last year