mitdbg / aurum-datadiscovery
☆78 Updated 2 years ago
Alternatives and similar repositories for aurum-datadiscovery
Users that are interested in aurum-datadiscovery are comparing it to the libraries listed below
- A Machine Learning System for Data Enrichment. ☆76 Updated 7 years ago
- SparkER: an Entity Resolution framework for Apache Spark ☆65 Updated last year
- A Generalized Data Cleaning System ☆51 Updated 9 years ago
- Source code for several Metanome data profiling algorithms ☆59 Updated 2 years ago
- ☆193 Updated last year
- An open-source, vendor-neutral data context service. ☆161 Updated 7 years ago
- FlorDB 🌻 ☆158 Updated 3 months ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated 2 years ago
- Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark. ☆346 Updated 3 weeks ago
- An open source, high scalability toolkit in Java for Entity Resolution. ☆222 Updated 7 months ago
- A Jupyter notebook extension to centralize and manage data ☆15 Updated 3 years ago
- Collection of some algorithms for entity resolution ☆28 Updated 10 years ago
- A Python wrapper over the GraphGen system ☆37 Updated 8 years ago
- Data ingestion library for Amundsen to build graph and search index ☆204 Updated last year
- PySpark phonetic and string matching algorithms ☆41 Updated last year
- Interactive-Speed Analytics: 200x Faster, 200x Fewer Cluster Resources, Approximate Query Processing ☆253 Updated 5 years ago
- Applications and APIs from Oracle Graph ☆53 Updated last month
- Distributed Temporal Graph Analytics with Apache Flink ☆252 Updated last month
- ☆11 Updated 8 years ago
- Project overview and links to various resources ☆20 Updated 4 years ago
- Code and Benchmarks for JOSIE (SIGMOD 2019) ☆19 Updated 2 years ago
- A comprehensive and scalable set of string tokenizers and similarity measures in Python ☆142 Updated last year
- A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching … ☆103 Updated 3 months ago
- Create HTML profiling reports from Apache Spark DataFrames ☆197 Updated 6 years ago
- ☆108 Updated 3 years ago
- WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing,… ☆113 Updated 3 years ago
- Asynchronous actions for PySpark ☆48 Updated 4 years ago
- Materials for Apache Arrow workshop at VLDB 2019 ☆42 Updated 5 years ago
- Metadata service library for Amundsen ☆82 Updated 3 weeks ago
- zenvisage's foundational framework ☆70 Updated 3 years ago