Raiffeisen-DGTL / checkita-data-quality
Fast data quality framework for modern data infrastructure
☆29 Updated this week
Alternatives and similar repositories for checkita-data-quality
Users who are interested in checkita-data-quality are comparing it to the libraries listed below.
- Distributed run of dbt models using Airflow ☆168 Updated 2 months ago
- DE or DIE meetup made by data engineers for data engineers. Currently in Russian only. ☆58 Updated 2 years ago
- Docker Compose with Almond.sh core for Jupyter ☆18 Updated last year
- Python client for MLflow REST API ☆36 Updated last year
- Data Engineering misc ☆14 Updated 4 years ago
- ☆12 Updated 4 years ago
- One ETL tool to rule them all ☆83 Updated this week
- ☆408 Updated last year
- ☆43 Updated 4 years ago
- Repository for the open course "Production Operation of Machine Learning Models" ☆97 Updated 2 years ago
- Practice course on Big Data ☆17 Updated last year
- Learning resources for Airflow Tutorial article. ☆56 Updated 5 years ago
- YTsaurus SPYT provides an integration with Apache Spark ☆19 Updated this week
- Ambrosia is a Python library for A/B tests design, split and result measurement ☆239 Updated last week
- SparkConnect Server plugin and protobuf messages for the Amazon Deequ Data Quality Engine. ☆26 Updated 11 months ago
- Allow parsing Russian receipts ☆53 Updated 5 years ago
- machine learning lifecycle framework ☆201 Updated 3 years ago
- Baseline for the RetailHero.ai/#2 task by @geffy 💪 ☆108 Updated 6 years ago
- 100 NumPy exercises (Russian version) ☆165 Updated last year
- Course on Apache Airflow 2.0 ☆36 Updated 5 months ago
- Course "Applied Problems of Data Analysis" (CMC faculty, Lomonosov Moscow State University) ☆324 Updated 3 years ago
- ☆29 Updated 3 years ago
- This project is used to capture machine learning pipelines created on top of Spark as OK ☆54 Updated 3 years ago
- Data Forge: a modern data stack playground to practice flows and best practices, not just tools. Spark, Trino, Kafka, Iceberg, ClickHous… ☆168 Updated 3 months ago
- ☆22 Updated last year
- Module for pipelines concept in PySpark ☆16 Updated last year
- Data Engineer RoadMap ☆35 Updated 3 years ago
- CLI app / pre-commit hook to clean Jupyter Notebooks metadata, execution_count and optionally output. ☆11 Updated 11 months ago
- Home assignments for data science positions ☆633 Updated 2 years ago
- A database-like benchmark of feature generation from time-series data ☆13 Updated last year