Hugging Face RoBERTa with Flash Attention 2
☆24 · Sep 14, 2025 · Updated 5 months ago
Alternatives and similar repositories for flash-roberta
Users who are interested in flash-roberta are comparing it to the libraries listed below.
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder ☆10 · Mar 16, 2023 · Updated 2 years ago
- ☆24 · Jan 30, 2025 · Updated last year
- ☆13 · Nov 19, 2022 · Updated 3 years ago
- Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking ☆13 · Feb 5, 2023 · Updated 3 years ago
- Evaluate state-of-the-art sparse embedding models on the LIMIT dataset (`limit-small` and `limit`) from Google's paper `On the Theoretica… ☆15 · Sep 4, 2025 · Updated 6 months ago
- Finetune mistral-7b-instruct for sentence embeddings ☆88 · May 2, 2024 · Updated last year
- Code for embedding and retrieval research. ☆16 · Oct 24, 2023 · Updated 2 years ago
- ☆43 · Apr 22, 2025 · Updated 10 months ago
- [SIGIR24] Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval ☆18 · Feb 29, 2024 · Updated 2 years ago
- doc-cov is a tool for measuring docstring coverage of Python projects. ☆12 · Mar 8, 2019 · Updated 6 years ago
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd… ☆64 · Dec 12, 2024 · Updated last year
- The Python Implementation of CRISP: Clustering Multi-Vector Representations for Denoising and Pruning ☆27 · Jul 27, 2025 · Updated 7 months ago
- ☆21 · Apr 16, 2024 · Updated last year
- Multi-Turn-Single-Intent Bert model for dialogue session classification ☆25 · Dec 8, 2022 · Updated 3 years ago
- presentation slides ☆20 · Feb 27, 2026 · Updated last week
- Implementation of the report: on the domain robustness of prefix and prompt tuning ☆20 · Mar 10, 2022 · Updated 3 years ago
- Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-Ranking ☆25 · Apr 4, 2025 · Updated 11 months ago
- Model implementation for the contextual embeddings project ☆41 · Jun 2, 2025 · Updated 9 months ago
- ☆57 · Jan 26, 2025 · Updated last year
- ☆47 · Feb 7, 2024 · Updated 2 years ago
- ☆21 · Apr 17, 2023 · Updated 2 years ago
- Source code for SummaReranker (ACL 2022) ☆25 · Jan 7, 2024 · Updated 2 years ago
- This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Pr… ☆26 · Jun 27, 2022 · Updated 3 years ago
- Check out the new version at the link! ☆22 · Dec 11, 2020 · Updated 5 years ago
- Big Data Analysis of Tinder done at Universitat Rovira i Virgili and Universitat Politècnica de Catalunya · BarcelonaTech ☆13 · Jan 3, 2023 · Updated 3 years ago
- ☆59 · Nov 17, 2025 · Updated 3 months ago
- https://hapo31.github.io/charcoal ☆11 · Dec 27, 2020 · Updated 5 years ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆34 · Aug 24, 2024 · Updated last year
- KETOD Knowledge-Enriched Task-Oriented Dialogue ☆32 · Jan 4, 2023 · Updated 3 years ago
- Deep Learning Utilities for PyTorch users (old name: Zero) ☆39 · Apr 21, 2025 · Updated 10 months ago
- Distilling Task-Specific Knowledge from Teacher Model into BiLSTM ☆32 · Dec 14, 2024 · Updated last year
- Crispy reranking models by Mixedbread ☆47 · Sep 17, 2025 · Updated 5 months ago
- Support site for the book 「Python自然言語処理入門」 (Introduction to Natural Language Processing with Python) ☆13 · Mar 25, 2020 · Updated 5 years ago
- Arabic News Stance Corpus ☆11 · Feb 5, 2021 · Updated 5 years ago
- User-friendly viewer for Parquet files ☆10 · Jan 10, 2026 · Updated last month
- Collection of scripts to pretrain T5 on unsupervised text, using PyTorch Lightning. CORD-19 pretraining provided as example. ☆32 · Apr 26, 2021 · Updated 4 years ago
- DOS Program Development ☆13 · Nov 9, 2022 · Updated 3 years ago
- fine-tuning tutorial ☆18 · Feb 20, 2026 · Updated 2 weeks ago
- SimADFuzz: Simulation-Feedback Fuzz Testing for Autonomous Driving Systems ☆10 · Apr 11, 2025 · Updated 10 months ago