cpitclaudel / dBoost
☆16 Updated 9 years ago
Alternatives and similar repositories for dBoost:
Users who are interested in dBoost are comparing it to the libraries listed below.
- A Generalized Data Cleaning System ☆49 Updated 8 years ago
- ☆75 Updated last year
- The BART Project: Benchmarking Algorithms for (data) Repairing and Translation ☆37 Updated last year
- A Machine Learning System for Data Enrichment. ☆75 Updated 6 years ago
- SparkER: an Entity Resolution framework for Apache Spark ☆63 Updated 9 months ago
- A database with automatic dynamic imputation of missing values. ☆10 Updated 7 years ago
- Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method ☆13 Updated last year
- Sketching linear classifiers over data streams with the Weight-Median Sketch (SIGMOD 2018). ☆39 Updated 6 years ago
- Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptio… ☆38 Updated last year
- Explaining Inference Queries with Bayesian Optimization ☆10 Updated 3 years ago
- A Python-to-SQL transpiler as replacement for Python Pandas ☆48 Updated 2 years ago
- Rheem - a cross-platform data processing system ☆5 Updated 2 years ago
- deep entity resolution lite version ☆11 Updated 5 years ago
- A Python wrapper over the GraphGen system ☆37 Updated 7 years ago
- A JSON-based schema for storing declarative descriptions of machine learning experiments ☆45 Updated 7 years ago
- A polystore database from researchers of the Intel Science and Technology Center for Big Data ☆37 Updated 2 years ago
- AutoBazaar: An AutoML System from the Machine Learning Bazaar ☆33 Updated 3 years ago
- Implements the Karnin-Lang-Liberty (KLL) algorithm in python ☆54 Updated 2 years ago
- A simple tool for plotting Spark ML's Decision Trees ☆41 Updated 2 years ago
- Implementation of TANE for experimental purposes ☆11 Updated 2 years ago
- ☆38 Updated 8 years ago
- Source code for several Metanome data profiling algorithms ☆52 Updated last year
- Inspect ML Pipelines in Python in the form of a DAG ☆70 Updated 10 months ago
- The Data Linter identifies potential issues (lints) in your ML training data. ☆87 Updated 7 years ago
- Affinity Propagation on Spark ☆19 Updated 3 years ago
- Condor allows for the specification of synopsis-based streaming jobs on top of general dataflow systems. Condor provides a collection of … ☆13 Updated 6 months ago
- A tool and library for easily deploying applications on Apache YARN ☆142 Updated 10 months ago
- Idempotent query executor ☆50 Updated 2 weeks ago
- ssh code ☆12 Updated 7 years ago
- Collection of some algorithms for entity resolution ☆28 Updated 9 years ago