Monitor the stability of a Pandas or Spark dataframe
☆512 · Jan 9, 2026 · Updated 4 months ago
Alternatives and similar repositories for popmon
Users that are interested in popmon are comparing it to the libraries listed below.
- A validated, reasonably fast, and easily extensible implementation of a Cox model in PyTorch ☆13 · Feb 4, 2021 · Updated 5 years ago
- PyCodeHash is a generic data and code hashing library that facilitates downstream caching. ☆13 · Jan 26, 2026 · Updated 4 months ago
- Dedicated Kafka Connector to track changes in MLflow Model Registry ☆10 · Jan 8, 2021 · Updated 5 years ago
- Spark Monitoring ☆14 · Feb 28, 2023 · Updated 3 years ago
- 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames. ☆13,567 · Apr 22, 2026 · Updated last month
- Type System for Data Analysis in Python ☆219 · Feb 1, 2025 · Updated last year
- SHAP-based validation for linear and tree-based models. Applied to binary, multiclass and regression problems. ☆153 · Apr 19, 2025 · Updated last year
- Ordeq simplifies IO and modularizes pipeline logic. ☆41 · Dec 19, 2025 · Updated 5 months ago
- Record matching and entity resolution at scale in Spark ☆36 · Oct 31, 2023 · Updated 2 years ago
- Python implementation of Histogrammar, a package for creating histograms with Numpy, Pandas and Spark. ☆36 · Sep 2, 2025 · Updated 8 months ago
- Extra blocks for scikit-learn pipelines. ☆1,396 · May 19, 2026 · Updated last week
- Always know what to expect from your data. ☆11,525 · May 21, 2026 · Updated last week
- Feature engineering and selection open-source Python library compatible with sklearn. ☆2,245 · Mar 28, 2026 · Updated 2 months ago
- A light-weight, flexible, and expressive statistical data testing library ☆4,356 · May 21, 2026 · Updated last week
- nannyml: post-deployment data science in python ☆2,138 · Jul 12, 2025 · Updated 10 months ago
- An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model perf… ☆2,820 · Jan 10, 2025 · Updated last year
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… ☆10,868 · May 22, 2026 · Updated last week
- Machine learning with dataframes ☆1,614 · May 21, 2026 · Updated last week
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆95 · May 18, 2026 · Updated last week
- Visualize and compare datasets, target values and associations, with one line of code. ☆3,104 · Apr 11, 2026 · Updated last month
- A simple wrapper to use Pandas Profiling easily in Kedro ☆17 · Apr 12, 2021 · Updated 5 years ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,536 · Dec 2, 2024 · Updated last year
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,569 · Apr 30, 2024 · Updated 2 years ago
- A unified framework for machine learning with time series ☆9,772 · May 20, 2026 · Updated last week
- Algorithms for outlier, adversarial and drift detection ☆2,518 · Dec 11, 2025 · Updated 5 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,162 · May 19, 2026 · Updated last week
- Build data pipelines, the easy way 🛠️ ☆4,140 · Jun 6, 2023 · Updated 2 years ago
- Python package to accelerate the sparse matrix multiplication and top-n similarity selection ☆423 · Apr 9, 2026 · Updated last month
- Python package for Model Metric Uncertainty estimation ☆17 · Updated this week
- Kedro Plugin to support running pipelines on Kubernetes using Airflow. ☆27 · Mar 11, 2025 · Updated last year
- EvalML is an AutoML library written in python. ☆850 · Jan 14, 2026 · Updated 4 months ago
- Open-source low code data preparation library in python. Collect, clean and visualize your data in python with a few lines of code. ☆2,241 · Jun 27, 2024 · Updated last year
- Use advanced feature engineering strategies and select best features from your data set with a single line of code. Created by Ram Seshad… ☆678 · Feb 19, 2025 · Updated last year
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆57 · Apr 27, 2026 · Updated last month
- Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models. ☆2,486 · Feb 11, 2026 · Updated 3 months ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆83 · May 10, 2024 · Updated 2 years ago
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ☆30 · Nov 10, 2022 · Updated 3 years ago
- Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML va… ☆4,017 · Dec 28, 2025 · Updated 5 months ago
- STUMPY is a powerful and scalable Python library for modern time series analysis ☆4,096 · May 15, 2026 · Updated 2 weeks ago