Monitor the stability of a Pandas or Spark dataframe ⚙︎
☆511 · Jan 9, 2026 · Updated 2 months ago
Alternatives and similar repositories for popmon
Users who are interested in popmon are comparing it to the libraries listed below.
- A validated, reasonably fast, and easily extensible implementation of a Cox model in PyTorch ☆13 · Feb 4, 2021 · Updated 5 years ago
- PyCodeHash is a generic data and code hashing library that facilitates downstream caching. ☆13 · Jan 26, 2026 · Updated 2 months ago
- Dedicated Kafka Connector to track changes in MLflow Model Registry ☆10 · Jan 8, 2021 · Updated 5 years ago
- Spark Monitoring ☆13 · Feb 28, 2023 · Updated 3 years ago
- 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames. ☆13,452 · Mar 3, 2026 · Updated 3 weeks ago
- Type System for Data Analysis in Python ☆217 · Feb 1, 2025 · Updated last year
- SHAP-based validation for linear and tree-based models. Applied to binary, multiclass and regression problems. ☆152 · Apr 19, 2025 · Updated 11 months ago
- Record matching and entity resolution at scale in Spark ☆36 · Oct 31, 2023 · Updated 2 years ago
- Ordeq simplifies IO and modularizes pipeline logic. ☆41 · Dec 19, 2025 · Updated 3 months ago
- Python implementation of Histogrammar, a package for creating histograms with Numpy, Pandas and Spark. ☆36 · Sep 2, 2025 · Updated 6 months ago
- Extra blocks for scikit-learn pipelines. ☆1,386 · Updated this week
- Always know what to expect from your data. ☆11,301 · Updated this week
- Feature engineering and selection open-source Python library compatible with sklearn. ☆2,219 · Mar 9, 2026 · Updated 3 weeks ago
- A light-weight, flexible, and expressive statistical data testing library ☆4,271 · Updated this week
- nannyml: post-deployment data science in python ☆2,134 · Jul 12, 2025 · Updated 8 months ago
- An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model perf… ☆2,804 · Jan 10, 2025 · Updated last year
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… ☆10,799 · Updated this week
- Visualize and compare datasets, target values and associations, with one line of code. ☆3,086 · Aug 6, 2024 · Updated last year
- Machine learning with dataframes ☆1,586 · Updated this week
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆92 · Mar 11, 2026 · Updated 2 weeks ago
- A simple wrapper to use Pandas Profiling easily in Kedro ☆17 · Apr 12, 2021 · Updated 4 years ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,538 · Dec 2, 2024 · Updated last year
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,569 · Apr 30, 2024 · Updated last year
- Algorithms for outlier, adversarial and drift detection ☆2,505 · Dec 11, 2025 · Updated 3 months ago
- A unified framework for machine learning with time series ☆9,666 · Updated this week
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,142 · Updated this week
- Build data pipelines, the easy way 🛠️ ☆4,138 · Jun 6, 2023 · Updated 2 years ago
- Python package to accelerate the sparse matrix multiplication and top-n similarity selection ☆421 · Mar 9, 2026 · Updated 3 weeks ago
- Python package for Model Metric Uncertainty estimation ☆16 · Sep 5, 2024 · Updated last year
- Kedro Plugin to support running pipelines on Kubernetes using Airflow. ☆27 · Mar 11, 2025 · Updated last year
- EvalML is an AutoML library written in python. ☆846 · Jan 14, 2026 · Updated 2 months ago
- Open-source low code data preparation library in python. Collect, clean and visualize your data in python with a few lines of code. ☆2,239 · Jun 27, 2024 · Updated last year
- Use advanced feature engineering strategies and select best features from your data set with a single line of code. Created by Ram Seshad… ☆678 · Feb 19, 2025 · Updated last year
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆57 · Jul 1, 2025 · Updated 8 months ago
- Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML va… ☆3,999 · Dec 28, 2025 · Updated 3 months ago
- Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models. ☆2,480 · Feb 11, 2026 · Updated last month
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆83 · May 10, 2024 · Updated last year
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ☆30 · Nov 10, 2022 · Updated 3 years ago
- STUMPY is a powerful and scalable Python library for modern time series analysis ☆4,073 · Mar 22, 2026 · Updated last week