xinyandai / string-embed
String embedding for fast edit distance computation; code for the paper [Convolutional Embedding for Edit Distance (SIGIR 20)].
☆60 Updated last year
Related projects
Alternatives and complementary repositories for string-embed
- This repository contains source code to binarize any real-value word embeddings into binary vectors. ☆46 Updated 3 years ago
- Code for pre-training CharacterBERT models (as well as BERT models). ☆34 Updated 3 years ago
- [KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding ☆57 Updated 3 years ago
- TransformerDB ☆19 Updated 3 years ago
- WordMoversEmbeddings(WME) is a simple code for generating the vector representation of sentence/document for text classification and clus… ☆81 Updated 5 years ago
- Implementation of SiameseXML (ICML 2021) ☆40 Updated 2 years ago
- PyTorch Implementation of Autoencoding Variational Inference for Topic Models (Srivastava and Sutton 2017) ☆38 Updated 5 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆81 Updated last month
- Converter from UD-trees to BART representation ☆36 Updated 8 months ago
- Framework for weakly supervised deep sequence taggers, focused on named entity recognition ☆80 Updated last year
- Code and data for the WSDM '19 paper "Crosslingual Document Embedding as Reduced-Rank Ridge Regression (Cr5)" ☆30 Updated 5 years ago
- HiCAL is a system for efficient high-recall retrieval with an adaptable assessing interface. ☆37 Updated last year
- Data programming by demonstration for information extraction and span annotation ☆35 Updated 3 years ago
- Source code for our AAAI 2020 paper P-SIF: Document Embeddings using Partition Averaging ☆35 Updated 4 years ago
- Learned string similarity for entity names using optimal transport. ☆34 Updated 3 years ago
- Train transformer-based models. ☆28 Updated last week
- SUPERT: Unsupervised multi-document summarization evaluation & generation ☆91 Updated last year
- Hyperparameter Search for AllenNLP ☆134 Updated 4 years ago
- [WWW 2020] Discriminative Topic Mining via Category-Name Guided Text Embedding ☆50 Updated 3 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 2 years ago
- An opensource TAR framework for experiments and applications ☆16 Updated 7 months ago
- codebase for the Text-based NP Enrichment (TNE) paper ☆19 Updated 8 months ago
- ☆73 Updated 3 years ago
- CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training ☆32 Updated 2 years ago
- This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations". ☆63 Updated 4 years ago
- Tool for disambiguating acronyms and abbreviations in text for NLP applications ☆20 Updated 5 months ago
- A python tool for building large scale Wikipedia-based Information Retrieval datasets ☆45 Updated 3 years ago
- Model for learning document embeddings along with their uncertainties ☆35 Updated 11 months ago
- An Interactive Tool for Scalable and Reproducible Error Analysis. ☆105 Updated 3 years ago
- source code of bison ☆26 Updated 4 years ago