The premier open source Data Quality solution
☆651May 6, 2026Updated last month
Alternatives and similar repositories for DataCleaner
Users that are interested in DataCleaner are comparing it to the libraries listed below.
- Mirror of Apache griffin☆1,170Aug 3, 2025Updated 10 months ago
- Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data…☆763Apr 2, 2026Updated 2 months ago
- Data quality control system with embedded AI☆48Sep 29, 2021Updated 4 years ago
- Tutorial and examples of Data Quality in Big Data System☆11Apr 25, 2017Updated 9 years ago
- Mirror of Apache Metamodel☆160Jun 23, 2021Updated 4 years ago
- Tool to automate data quality checks on data pipelines☆256Sep 10, 2022Updated 3 years ago
- DataQuality for BigData☆149Dec 15, 2023Updated 2 years ago
- DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitizati…☆3,262Nov 4, 2025Updated 7 months ago
- Spark package for checking data quality☆221Feb 28, 2020Updated 6 years ago
- Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies…☆1,111Jan 12, 2023Updated 3 years ago
- The Context Platform for your Data and AI Stack☆12,041Updated this week
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting…☆4,771Jun 1, 2026Updated last week
- OpenRefine is a free, open source power tool for working with messy data and improving it☆11,856Updated this week
- Data governance and data quality checking/monitoring platform (Django + jQuery + MySQL)☆186Dec 8, 2022Updated 3 years ago
- Apache Atlas - Open Metadata Management and Governance capabilities across the Hadoop platform and beyond☆2,106Jun 2, 2026Updated last week
- ☆1,684May 5, 2026Updated last month
- Moonbox is a DVtaaS (Data Virtualization as a Service) Platform☆505Apr 14, 2023Updated 3 years ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.☆3,618May 29, 2026Updated last week
- Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.☆741Apr 18, 2026Updated last month
- Model driven data quality service☆240Dec 4, 2017Updated 8 years ago
- Collect, aggregate, and visualize a data ecosystem's metadata☆2,205Jun 2, 2026Updated last week
- Hop Orchestration Platform☆1,390Jun 2, 2026Updated last week
- A library to store metadata of relational databases including the schema, statistics, and integrity constraints.☆25Aug 7, 2018Updated 7 years ago
- Egeria core☆911Jun 1, 2026Updated last week
- Pentaho Data Integration ( ETL ) a.k.a Kettle☆8,345Jun 2, 2026Updated last week
- A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, orga…☆2,265Jun 1, 2026Updated last week
- Exchangis is a lightweight, highly extensible data exchange platform that supports data transmission between structured and unstructured h…☆461Oct 28, 2025Updated 7 months ago
- πflow is a big data flow engine with spark support☆542Oct 22, 2025Updated 7 months ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ…☆193Jan 5, 2026Updated 5 months ago
- First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business…☆1,408Updated this week
- A data integration framework☆4,108Dec 2, 2025Updated 6 months ago
- Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, res…☆814Dec 11, 2024Updated last year
- Data Contracts engine for the modern data stack. https://www.soda.io☆2,366Updated this week
- Davinci is a DVsaaS (Data Visualization as a Service) Platform☆5,005Sep 5, 2023Updated 2 years ago
- Kettle Web Integrator - An easy and open way to integrate your web app with Kettle Pentaho Data Integration☆48Nov 27, 2015Updated 10 years ago
- Kettle Online Business Intelligence Platform -- Pentaho Data Integration ( ETL ) a.k.a Kettle☆18May 26, 2017Updated 9 years ago
- An Open Standard for lineage metadata collection☆2,497Updated this week
- Always know what to expect from your data.☆11,548Updated this week
- ☆51Updated this week