EleutherAI/improved-t5

Experiments for efforts to train a new and improved t5

☆76

Alternatives and similar repositories for improved-t5

Users who are interested in improved-t5 are comparing it to the libraries listed below.

fyvo / WMT-Biomed-Test
☆13 · Aug 23, 2024 · Updated last year
VITA-Group / Robust_Weight_Signatures
[ICML 2023] "Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?" by Ruisi Cai, Zhenyu Zhang, Zhangyang Wang
☆16 · May 4, 2023 · Updated 3 years ago
catie-aq / flashT5
A fast implementation of T5/UL2 in PyTorch using Flash Attention
☆116 · Oct 30, 2025 · Updated 8 months ago
zbambergerNLP / principled-pre-training
A repository to get acquainted with basic training tasks in natural language processing and machine learning
☆11 · Dec 27, 2023 · Updated 2 years ago
davisrbr / conjectures-arxiv
OpenConjecture, a dataset of mathematics conjectures pulled from papers published to the ArXiv
☆15 · Jul 12, 2026 · Updated last week
sekstini / basedxl
☆18 · Mar 18, 2024 · Updated 2 years ago
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆62 · Jun 21, 2023 · Updated 3 years ago
drarijitdas / Natural-GaLore
An extension to the GaLore paper, to perform Natural Gradient Descent in low rank subspace
☆19 · Oct 21, 2024 · Updated last year
PiotrNawrot / nanoT5
Fast & Simple repository for pre-training and fine-tuning T5-style models
☆1,021 · Aug 21, 2024 · Updated last year
p1nksnow / MoICE
Official implementation for "Mixture of In-Context Experts Enhance LLMs’ Awareness of Long Contexts" (accepted at NeurIPS 2024)
☆14 · Jan 7, 2025 · Updated last year
AnswerDotAI / sqlite-minutils
A fork of sqlite-utils with CLI etc removed
☆17 · Jul 11, 2026 · Updated last week
huggingface / hf-endpoints-documentation
☆27 · Jun 23, 2026 · Updated 3 weeks ago
socialfoundations / benchbench
BenchBench is a Python package to evaluate multi-task benchmarks.
☆23 · Oct 12, 2025 · Updated 9 months ago
iliaschalkidis / flash-roberta
Hugging Face RoBERTa with Flash Attention 2
☆24 · Sep 14, 2025 · Updated 10 months ago
Jiangtong-Li / Subword-ELMo
☆12 · Mar 20, 2020 · Updated 6 years ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆102 · Apr 2, 2024 · Updated 2 years ago
TaiMingLu / know-dont-tell
☆19 · Oct 14, 2024 · Updated last year
tau-nlp / zero_scrolls
Running inference on the ZeroSCROLLS benchmark
☆22 · Apr 18, 2024 · Updated 2 years ago
HazyResearch / train-tk
train with kittens!
☆66 · Oct 25, 2024 · Updated last year
SawyerHood / develop.sh
☆21 · May 26, 2024 · Updated 2 years ago
rhubarbwu / linguistic-collapse
Codebase for Linguistic Collapse: Neural Collapse in (Large) Language Models [NeurIPS 2024] [arXiv:2405.17767]
☆19 · Apr 14, 2025 · Updated last year
JHU-CLSP / ettin-encoder-vs-decoder
State-of-the-art paired encoder and decoder models (17M-1B params)
☆74 · Aug 6, 2025 · Updated 11 months ago
Yuanhy1997 / HyPe
HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023]
☆14 · Jul 11, 2023 · Updated 3 years ago
samblouir / birdie
☆15 · Jun 8, 2026 · Updated last month
jkallini / mrt5
Code repository for the paper "MrT5: Dynamic Token Merging for Efficient Byte-level Language Models."
☆59 · Sep 25, 2025 · Updated 9 months ago
NathanGodey / headless-lm
Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…
☆29 · Apr 17, 2024 · Updated 2 years ago
pdufter / minimult
Analyzing mBERT's multilinguality in a small laboratory setting
☆13 · Jun 12, 2023 · Updated 3 years ago
joeljang / ELM
[ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning
☆99 · Apr 26, 2023 · Updated 3 years ago
epfl-dlab / pairformance
Tool to perform paired evaluation of automatic systems
☆13 · Oct 20, 2021 · Updated 4 years ago
allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48 · Jan 17, 2024 · Updated 2 years ago
interlocklabs / exp-1.0.0-promptlab
A web app to experiment with chained prompts faster.
☆16 · Mar 15, 2023 · Updated 3 years ago
mlfoundations / open_lm
A repository for research on medium sized language models.
☆537 · Jun 6, 2025 · Updated last year
remote-startup-senpai / anime-background-gan-hf-space
GitHub repository linked to AnimeBackgroundGAN HuggingFace Space
☆10 · May 24, 2022 · Updated 4 years ago
tml-tuebingen / chatgpt-algorithm-exam
ChatGPT Participates in a Computer Science Exam (2023)
☆31 · Mar 21, 2023 · Updated 3 years ago
ielab / CharacterBERT-DR
The official repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGI…
☆16 · May 4, 2022 · Updated 4 years ago
MilesCranmer / pysr_scaling_laws
You should use PySR to find scaling laws. Here's an example.
☆34 · Sep 30, 2023 · Updated 2 years ago
modestyachts / cifar-10.2
Host CIFAR-10.2 Data Set
☆13 · Sep 22, 2021 · Updated 4 years ago
JunjieHu / amber
Explicit Alignment Objectives for Multilingual Bidirectional Encoders
☆14 · Apr 14, 2021 · Updated 5 years ago
apple / ml-reversal-blessing
☆17 · Jul 31, 2025 · Updated 11 months ago