mitdbg / lazo
Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method
☆15 Updated 2 years ago
Alternatives and similar repositories for lazo
Users that are interested in lazo are comparing it to the libraries listed below
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆21 Updated 3 years ago
- Graph Engine for Exploration and Search ☆42 Updated 2 years ago
- FlexMatcher is a schema matching package in Python which handles the problem of matching multiple schemas to a single mediated schema. ☆29 Updated last year
- A comprehensive and scalable set of string tokenizers and similarity measures in Python ☆142 Updated last year
- A Cython implementation of the affine gap string distance ☆57 Updated 3 years ago
- Project overview and links to various resources ☆20 Updated 4 years ago
- Scalable String Similarity Joins in Python ☆39 Updated last year
- ☆11 Updated 2 years ago
- Record Linkage ToolKit (Find and link entities) ☆111 Updated 2 years ago
- It has never been easier to transform your RDF data into a property graph based on TinkerPop-Gremlin. ☆25 Updated 5 years ago
- ☆78 Updated 2 years ago
- Apache datasketches ☆39 Updated last week
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ☆30 Updated 3 years ago
- Resources for tackling record linkage / deduplication / data matching problems ☆126 Updated last year
- Distributed Bayesian Entity Resolution in Apache Spark ☆59 Updated 4 years ago
- SparkER: an Entity Resolution framework for Apache Spark ☆65 Updated last year
- Entity linking, entity typing and relation extraction: Matching CSV to a Wikibase instance (e.g., Wikidata) via Meta-lookup ☆70 Updated 7 months ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut… ☆161 Updated 3 years ago
- A maximum-strength name parser for record linkage. ☆39 Updated 4 months ago
- Mirror from: https://gitlab.com/ViDA-NYU/auctus/auctus ☆44 Updated 8 months ago
- ☆17 Updated 10 years ago
- A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching … ☆102 Updated 3 months ago
- Algorithms for "schema matching" ☆26 Updated 9 years ago
- Python package for deduplication/entity resolution using active learning ☆83 Updated last year
- This repository provides data and scripts to use Sherlock, a DL-based model for semantic data type detection: https://sherlock.media.mit.… ☆180 Updated last year
- ☆70 Updated 3 years ago
- Python wrapper for a C++ Double Metaphone ☆15 Updated 2 weeks ago
- ☆193 Updated last year
- A Fuzzy Matching Approach for Clustering Strings ☆30 Updated 2 years ago
- Fork of the Freely Extensible Biomedical Record Linkage program ☆25 Updated 9 years ago