nabihach / FD_CFD_extraction
Code to extract functional dependencies (FDs) and conditional functional dependencies (CFDs) from data
☆36Updated 4 years ago
Alternatives and similar repositories for FD_CFD_extraction:
Users who are interested in FD_CFD_extraction are comparing it to the libraries listed below
- ☆18Updated 5 years ago
- Source code for several Metanome data profiling algorithms☆53Updated last year
- A Generalized Data Cleaning System☆49Updated 8 years ago
- Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning☆51Updated 2 years ago
- Project overview and links to various resources☆19Updated 3 years ago
- Code for extracting, parsing and annotating tables from GitTables (https://gittables.github.io).☆43Updated 3 years ago
- A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching …☆88Updated 3 weeks ago
- Implementation of TANE for experimental purposes☆12Updated 2 years ago
- Code and data for Sato https://arxiv.org/abs/1911.06311.☆112Updated last year
- Benchmark Datasets for Set Similarity Search☆12Updated 6 years ago
- A Jupyter notebook extension to centralize and manage data☆14Updated 2 years ago
- ☆27Updated 6 years ago
- A Benchmark for Joint Data Cleaning and Machine Learning☆47Updated 10 months ago
- ☆15Updated 2 years ago
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies☆19Updated 2 years ago
- Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptio…☆40Updated last year
- A Python tool using XGBoost and sentence-transformers to perform schema matching on tables.☆32Updated 2 months ago
- Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method☆13Updated last year
- ☆11Updated last year
- Python library that classifies content from scientific papers with the topics of the Computer Science Ontology (CSO).☆89Updated 4 months ago
- FlexMatcher is a schema matching package in Python which handles the problem of matching multiple schemas to a single mediated schema.☆29Updated 4 months ago
- T2K Match is a matching algorithm optimised to match millions of web tables to a central knowledge base.☆21Updated 6 years ago
- A new framework to generate interpretable classification rules☆17Updated 2 years ago
- Dataset search engine, discovering data from a variety of sources, profiling it, and allowing advanced queries on the index☆43Updated last year
- An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.☆76Updated this week
- A Python-to-SQL transpiler as replacement for Python Pandas☆48Updated 2 years ago
- The Python-JGraphT library☆24Updated 7 months ago
- JedAI-WebApp is a GUI that facilitates the execution of JedAI. JedAI is an open source, high scalability toolkit that offers out-of-the-b…☆23Updated 2 years ago
- ☆11Updated 7 years ago
- Generating Realistic Synthetic Data☆34Updated last year