canimus / cuallee
Possibly the fastest DataFrame-agnostic quality check library in town.
☆236 · Feb 5, 2026 · Updated last week
Alternatives and similar repositories for cuallee
Users who are interested in cuallee are comparing it to the libraries listed below.
- ☆15 · Dec 11, 2023 · Updated 2 years ago
- Turning PySpark Into a Universal DataFrame API ☆485 · Feb 4, 2026 · Updated last week
- The smallest DuckDB SQL orchestrator on Earth. ☆336 · Nov 22, 2025 · Updated 2 months ago
- Lightweight and extensible compatibility layer between dataframe libraries! ☆1,519 · Updated this week
- Cost Efficient Data Pipelines with DuckDB ☆61 · May 14, 2025 · Updated 9 months ago
- Feature engineering library that helps you keep track of feature dependencies, documentation and schema ☆28 · Jan 21, 2022 · Updated 4 years ago
- A custom end-to-end analytics platform for customer churn ☆11 · May 15, 2025 · Updated 9 months ago
- A high-performance data streaming system using DuckDB and Apache Arrow Flight. ☆96 · Feb 22, 2025 · Updated 11 months ago
- ☆30 · Dec 4, 2024 · Updated last year
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆682 · Mar 6, 2025 · Updated 11 months ago
- 🏃‍♀️ Minimalist SQL orchestrator ☆302 · Updated this week
- Delta Lake helper methods in PySpark ☆327 · Jan 19, 2026 · Updated 3 weeks ago
- An implementation of Measures in SQL as a DuckDB extension ☆39 · Jan 29, 2026 · Updated 2 weeks ago
- Scalable and efficient data transformation framework - backwards compatible with dbt. ☆2,891 · Updated this week
- A repository of blogs/videos that presents how Apache Iceberg is being used in Production by various orgs ☆18 · Jul 31, 2023 · Updated 2 years ago
- Sentiment and language detection for text analytics. ☆17 · Jul 3, 2024 · Updated last year
- A data modelling layer built on top of polars and pydantic ☆603 · Feb 4, 2026 · Updated last week
- Primary repository for NYC DCP's Data Engineering team ☆33 · Updated this week
- 🏁 A sweet and speedy code generator for dbt 🏎️✨ ☆32 · Jan 23, 2026 · Updated 3 weeks ago
- Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage/tracing and… ☆2,397 · Updated this week
- Python API for Deequ ☆810 · Jan 21, 2026 · Updated 3 weeks ago
- A Python Library to support running data quality rules while the spark job is running⚡ ☆197 · Updated this week
- ☆22 · Nov 30, 2022 · Updated 3 years ago
- ☆26 · Nov 14, 2024 · Updated last year
- The best Python package for comparing two dataframes ☆11 · Dec 29, 2021 · Updated 4 years ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆279 · Oct 7, 2025 · Updated 4 months ago
- a convenient way to anonymize your data for analytics ☆22 · Nov 7, 2021 · Updated 4 years ago
- PySpark test helper methods with beautiful error messages ☆752 · Jan 13, 2026 · Updated last month
- Code to demonstrate data engineering metadata & logging best practices ☆20 · Mar 12, 2024 · Updated last year
- Minimal plugin loading package for polars with optional type stub generation ☆20 · Jan 29, 2026 · Updated 2 weeks ago
- Library for conditional Gaussian mixture models, compatible with scikit-learn. ☆38 · Oct 1, 2025 · Updated 4 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,136 · Feb 5, 2026 · Updated last week
- [Project moved] Polars integration for Dagster ☆35 · Apr 17, 2025 · Updated 9 months ago
- A native Rust library for Delta Lake, with bindings into Python ☆3,135 · Updated this week
- Beautifully colored, quick and simple Python logging ☆42 · May 22, 2021 · Updated 4 years ago
- Polars plugin offering eXtra stuff for DateTimes ☆232 · Dec 5, 2025 · Updated 2 months ago
- Local development environment for python data projects, with Docker ☆23 · Dec 14, 2022 · Updated 3 years ago
- ☆31 · Dec 15, 2023 · Updated 2 years ago
- DuckDB for streaming data ☆748 · Sep 4, 2025 · Updated 5 months ago