iliaschalkidis / flash-robertaLinks

Hugging Face RoBERTa with Flash Attention 2

☆24

Alternatives and similar repositories for flash-roberta

Users that are interested in flash-roberta are comparing it to the libraries listed below

Sorting:

oriram / spider
☆54Updated 2 years ago
nreimers / se-pytorch-xla
☆21Updated 4 years ago
martiansideofthemoon / longeval-summarization
Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…
☆44Updated last year
unicamp-dl / ExaRanker
☆29Updated last year
castorini / mr.tydi
Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.
☆79Updated 3 years ago
jungokasai / beam_with_patience
☆46Updated 3 years ago
ielab / Starbucks
Starbucks: Improved Training for 2D Matryoshka Embeddings
☆22Updated 5 months ago
huggingface / olm-training
Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.
☆96Updated 2 years ago
google-research / t5x_retrieval
☆101Updated 2 years ago
ivanmontero / autobot
Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models'
☆17Updated 3 years ago
guilhermemr04 / scaling-zero-shot-retrieval
No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval
☆29Updated 3 years ago
thakur-nandan / sprint
SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.
☆47Updated 2 years ago
castorini / dhr
Dense hybrid representations for text retrieval
☆63Updated 2 years ago
thakur-nandan / income
INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ.
☆24Updated 2 years ago
Tomiinek / Aargh
☆12Updated last year
google-research-datasets / swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…
☆49Updated 2 years ago
JetRunner / LaPraDoR
🦮 Code and pretrained models for Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie…
☆49Updated 3 years ago
allenai / EmbeddingRecycling
Embedding Recycling for Language models
☆38Updated 2 years ago
martiansideofthemoon / rankgen
Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx…
☆138Updated 2 years ago
frankxu2004 / knnlm-why
Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"
☆59Updated 2 years ago
asahi417 / relbert
The official implementation of "Distilling Relation Embeddings from Pre-trained Language Models, EMNLP 2021 main conference", a high-qual…
☆47Updated 11 months ago
terrierteam / pyterrier_doc2query
☆37Updated 3 weeks ago
bigscience-workshop / multilingual-modeling
BLOOM+1: Adapting BLOOM model to support a new unseen language
☆74Updated last year
juletx / self-translate
Do Multilingual Language Models Think Better in English?
☆42Updated 2 years ago
gsarti / t5-flax-gcp
Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP
☆58Updated 3 years ago
orevaahia / magnet-tokenization
☆13Updated 11 months ago
adapter-hub / hgiyt
Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"
☆27Updated 4 years ago
sebastian-hofstaetter / colberter
☆46Updated 3 years ago
facebookresearch / romqa
A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering
☆17Updated 2 years ago
Clyde013 / Paraphrase-OPT
Observe the slow deterioration of my mental sanity in the github commit history
☆12Updated 2 years ago