mikulskibartosz / check-engine
Data validation library for PySpark 3.0.0
☆33 · Nov 11, 2022 · Updated 3 years ago
Alternatives and similar repositories for check-engine
Users who are interested in check-engine are comparing it to the libraries listed below:
- ☆10 · Feb 18, 2021 · Updated 4 years ago
- Rocksdb state storage implementation for Structured Streaming. ☆17 · Oct 21, 2020 · Updated 5 years ago
- A toolset to streamline running spark python on EMR ☆20 · Nov 16, 2016 · Updated 9 years ago
- Terraform plans & commands to provision Azure VMSS and VM from a VM image on demand or from a Jenkins pipeline. ☆27 · Aug 9, 2018 · Updated 7 years ago
- spark on kubernetes ☆104 · Feb 20, 2023 · Updated 2 years ago
- Code and examples for O'Reilly's Data Wrangling with Python video course ☆28 · Jun 8, 2016 · Updated 9 years ago
- Open source stack lakehouse ☆25 · Mar 2, 2024 · Updated last year
- Materials for O'Reilly DL 4 NLP tutorial (SF 2017) ☆25 · Sep 18, 2017 · Updated 8 years ago
- ☆10 · Jun 29, 2021 · Updated 4 years ago
- Record matching and entity resolution at scale in Spark ☆36 · Oct 31, 2023 · Updated 2 years ago
- ☆12 · Oct 18, 2022 · Updated 3 years ago
- Deploy scikit-learn models to a REST API using Docker ☆10 · May 1, 2023 · Updated 2 years ago
- Slides and code used in the lectures ☆12 · Aug 12, 2019 · Updated 6 years ago
- 🥪💾 A sample of data from the `jaffle-shop-generator` that powers the Jaffle Shop spanning one year. ☆14 · Jan 23, 2025 · Updated last year
- Data Catalog for Databases and Data Warehouses ☆36 · Jan 15, 2024 · Updated 2 years ago
- Dashboard in Python with automatic updates and email notifications ☆10 · Jun 9, 2022 · Updated 3 years ago
- Uncovering User Interest from Biased and Noised Watch Time in Video Recommendation. In Recsys23. ☆11 · Jul 18, 2023 · Updated 2 years ago
- R dashboard as a designer ☆10 · Oct 29, 2015 · Updated 10 years ago
- Manage Unity Catalog tables with Pydantic Models ☆10 · Mar 5, 2025 · Updated 11 months ago
- M. Gągolewski, M. Bartoszuk, A. Cena, Przetwarzanie i analiza danych w języku Python, PWN, 2016 ☆10 · Jan 16, 2023 · Updated 3 years ago
- prebuilt configurations for docker-rpm-builder ☆11 · Feb 5, 2021 · Updated 5 years ago
- ☆10 · Jan 28, 2025 · Updated last year
- Scala library for parsing fixed length file format ☆13 · Oct 19, 2021 · Updated 4 years ago
- ☆13 · Oct 15, 2021 · Updated 4 years ago
- The Data Product Specification ☆11 · Jan 28, 2025 · Updated last year
- Collect and aggregate on spark events for profitz ☆10 · Apr 22, 2022 · Updated 3 years ago
- Eppo Python SDK ☆12 · Nov 8, 2024 · Updated last year
- Tracebacks for Humans (in Jupyter notebooks) ☆12 · Dec 30, 2025 · Updated last month
- Datamallet is a Python library which contains several helper functions and modules for the common tasks in a typical data science workflow… ☆11 · May 19, 2022 · Updated 3 years ago
- Deep Learning for Computer Vision Practitioner Bundle examples and exercises ☆11 · Apr 23, 2019 · Updated 6 years ago
- Github action for running python unit tests ☆10 · Jun 16, 2025 · Updated 7 months ago
- Playground site for creating/validating data contracts ☆11 · Aug 9, 2025 · Updated 6 months ago
- This project provides sample code to implement API GW GraphQL APIs (the sample code uses the Python GraphQL library Graphene) in Lambda functi… ☆10 · May 3, 2021 · Updated 4 years ago
- ☆94 · Aug 15, 2022 · Updated 3 years ago
- ☆13 · Nov 14, 2013 · Updated 12 years ago
- ☆11 · Nov 17, 2020 · Updated 5 years ago
- An example of how the LIME algorithm can be used to provide real-world insight into the decision processes of a 'black-box' machine learn… ☆15 · Feb 19, 2019 · Updated 6 years ago
- An end to end ML project. Using MLflow for experiment tracking and model registry. Prefect for workflow orchestration. S3 for artifacts s… ☆12 · Sep 11, 2022 · Updated 3 years ago
- A collection of CMake modules to simplify the development of Boost libraries. ☆10 · Apr 16, 2012 · Updated 13 years ago