xinyandai / string-embed
String embedding for fast edit distance computation; code for [Convolutional Embedding for Edit Distance (SIGIR 20)].
☆61Updated 2 years ago
Alternatives and similar repositories for string-embed
Users interested in string-embed are comparing it to the libraries listed below.
- Learned string similarity for entity names using optimal transport.☆35Updated 4 years ago
- locality sensitive hashing (LSHASH) for Python3☆69Updated last month
- Code for pre-training CharacterBERT models (as well as BERT models).☆34Updated 3 years ago
- This repository contains source code to binarize any real-value word embeddings into binary vectors.☆47Updated 4 years ago
- This project focuses on DeepER, a deep learning framework for entity resolution (record deduplication). It examines how DeepER performs o…☆47Updated 7 years ago
- Repository for performing Blocking using Deep Learning based on the paper "Deep Learning for Blocking in Entity Matching: A Design Space …☆32Updated 2 years ago
- [KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding☆57Updated 4 years ago
- The dataset for the paper "Machamp: A Generalized Entity Matching Benchmark" published in CIKM 2021☆20Updated 3 years ago
- WordMoversEmbeddings(WME) is a simple code for generating the vector representation of sentence/document for text classification and clus…☆81Updated 6 years ago
- Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between…☆31Updated 2 years ago
- Code and data for "TURL: Table Understanding through Representation Learning"☆122Updated 3 years ago
- An Interactive Tool for Scalable and Reproducible Error Analysis.☆107Updated 3 years ago
- Scalable Hierarchical Clustering with Tree Grafting☆28Updated 2 years ago
- Numba-based version of DimmWitted Gibbs sampler☆46Updated 7 years ago
- Framework for weakly supervised deep sequence taggers, focused on named entity recognition☆78Updated 2 years ago
- IR-BERT at TREC 2020: Leveraging BERT for Semantic Search in Background Linking☆14Updated 3 years ago
- Zero-Shot Open Entity Typing as Type-Compatible Grounding, EMNLP'18.☆42Updated 5 years ago
- Converter from UD-trees to BART representation☆36Updated last year
- CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training☆32Updated 2 years ago
- Backtranslations of IMDB movie reviews for Data Augmentation Purposes☆11Updated 6 years ago
- Implementation of SiameseXML (ICML 2021)☆40Updated 2 years ago
- Topic clustering library built on Transformer embeddings and cosine similarity metrics. Compatible with all BERT base transformers from hu…☆43Updated 4 years ago
- Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations"☆17Updated 3 years ago
- EMNLP BlackBox NLP 2020: Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples☆23Updated 4 years ago
- Implementation of experiments in paper "Learning from Rules Generalizing Labeled Exemplars" to appear in ICLR2020 (https://openreview.net…☆50Updated 2 years ago
- TREC-COVID results - this is a mirror of data on the TREC website in a more convenient format.☆14Updated 4 years ago
- Code and data for the WSDM '19 paper "Crosslingual Document Embedding as Reduced-Rank Ridge Regression (Cr5)"☆30Updated 5 years ago
- State of the art Semantic Sentence Embeddings☆99Updated 3 years ago
- Source code for our AAAI 2020 paper P-SIF: Document Embeddings using Partition Averaging☆34Updated 5 years ago
- Extremely simple and fast extreme multi-class and multi-label classifiers.☆69Updated 2 months ago