facebookresearch / BELA
Bi-encoder entity linking architecture
☆44 Updated 4 months ago
Alternatives and similar repositories for BELA:
Users who are interested in BELA are comparing it to the libraries listed below
- Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0. ☆102 Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 Updated last year
- ☆96 Updated 2 years ago
- 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡ ☆81 Updated 2 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆45 Updated last year
- The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch. ☆99 Updated last year
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 2 weeks ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆40 Updated 3 years ago
- PyTorch-IE: State-of-the-art Information Extraction in PyTorch ☆77 Updated 2 weeks ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆66 Updated 2 years ago
- Truly flash T5 realization! ☆61 Updated 8 months ago
- The official repository for Efficient Long-Text Understanding Using Short-Text Models (Ivgi et al., 2022) paper ☆68 Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆17 Updated 3 months ago
- LTG-Bert ☆29 Updated last year
- A spaCy custom component that extracts and normalizes temporal expressions ☆52 Updated last year
- Shared code for training sentence embeddings with Flax / JAX ☆27 Updated 3 years ago
- ☆51 Updated last year
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆51 Updated last year
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆77 Updated 4 months ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated last year
- Code for "Open Vocabulary Extreme Classification Using Generative Models" ☆24 Updated 2 years ago
- Our open source implementation of MiniLMv2 (https://aclanthology.org/2021.findings-acl.188) ☆60 Updated last year
- Ranking of fine-tuned HF models as base models. ☆35 Updated last year
- SQuARE: Software for question answering research. ☆73 Updated 7 months ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆73 Updated 2 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆29 Updated 2 years ago
- Official implementation of "GPT or BERT: why not both?" ☆46 Updated this week
- Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining. ☆30 Updated last year
- Open source library for few shot NLP ☆77 Updated last year
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆43 Updated last year