Monitor the stability of a Pandas or Spark dataframe
☆511 · Jan 9, 2026 · Updated 4 months ago
Alternatives and similar repositories for popmon
Users that are interested in popmon are comparing it to the libraries listed below.
- A validated, reasonably fast, and easily extensible implementation of a Cox model in PyTorch ☆13 · Feb 4, 2021 · Updated 5 years ago
- PyCodeHash is a generic data and code hashing library that facilitates downstream caching. ☆13 · Jan 26, 2026 · Updated 3 months ago
- Dedicated Kafka Connector to track changes in MLflow Model Registry ☆10 · Jan 8, 2021 · Updated 5 years ago
- Spark Monitoring ☆13 · Feb 28, 2023 · Updated 3 years ago
- 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames. ☆13,541 · Apr 22, 2026 · Updated 2 weeks ago
- Type System for Data Analysis in Python ☆217 · Feb 1, 2025 · Updated last year
- SHAP-based validation for linear and tree-based models. Applied to binary, multiclass and regression problems. ☆152 · Apr 19, 2025 · Updated last year
- Ordeq simplifies IO and modularizes pipeline logic. ☆41 · Dec 19, 2025 · Updated 4 months ago
- Record matching and entity resolution at scale in Spark ☆36 · Oct 31, 2023 · Updated 2 years ago
- Python implementation of Histogrammar, a package for creating histograms with Numpy, Pandas and Spark. ☆36 · Sep 2, 2025 · Updated 8 months ago
- Extra blocks for scikit-learn pipelines. ☆1,391 · Updated this week
- Always know what to expect from your data. ☆11,467 · Updated this week
- Feature engineering and selection open-source Python library compatible with sklearn. ☆2,234 · Mar 28, 2026 · Updated last month
- A light-weight, flexible, and expressive statistical data testing library ☆4,327 · Updated this week
- nannyml: post-deployment data science in python ☆2,140 · Jul 12, 2025 · Updated 9 months ago
- An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model perf… ☆2,816 · Jan 10, 2025 · Updated last year
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… ☆10,860 · May 1, 2026 · Updated last week
- ☆10 · Nov 29, 2019 · Updated 6 years ago
- Machine learning with dataframes ☆1,606 · Updated this week
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆94 · Mar 11, 2026 · Updated last month
- Visualize and compare datasets, target values and associations, with one line of code. ☆3,096 · Apr 11, 2026 · Updated 3 weeks ago
- A simple wrapper to use Pandas Profiling easily in Kedro ☆17 · Apr 12, 2021 · Updated 5 years ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,534 · Dec 2, 2024 · Updated last year
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,569 · Apr 30, 2024 · Updated 2 years ago
- A unified framework for machine learning with time series ☆9,757 · Updated this week
- Algorithms for outlier, adversarial and drift detection ☆2,516 · Dec 11, 2025 · Updated 4 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,156 · Updated this week
- Build data pipelines, the easy way 🛠️ ☆4,138 · Jun 6, 2023 · Updated 2 years ago
- Python package to accelerate the sparse matrix multiplication and top-n similarity selection ☆422 · Apr 9, 2026 · Updated last month
- Python package for Model Metric Uncertainty estimation ☆17 · Sep 5, 2024 · Updated last year
- Kedro Plugin to support running pipelines on Kubernetes using Airflow. ☆27 · Mar 11, 2025 · Updated last year
- EvalML is an AutoML library written in python. ☆847 · Jan 14, 2026 · Updated 3 months ago
- Open-source low code data preparation library in python. Collect, clean and visualize your data in python with a few lines of code. ☆2,241 · Jun 27, 2024 · Updated last year
- Use advanced feature engineering strategies and select best features from your data set with a single line of code. Created by Ram Seshad… ☆678 · Feb 19, 2025 · Updated last year
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆57 · Apr 27, 2026 · Updated last week
- Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models. ☆2,485 · Feb 11, 2026 · Updated 2 months ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆83 · May 10, 2024 · Updated last year
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ☆30 · Nov 10, 2022 · Updated 3 years ago
- Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML va… ☆4,012 · Dec 28, 2025 · Updated 4 months ago