jkallini / mrt5Links

Code repository for the paper "MrT5: Dynamic Token Merging for Efficient Byte-level Language Models."

☆51

Alternatives and similar repositories for mrt5

Users that are interested in mrt5 are comparing it to the libraries listed below

Sorting:

ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60Updated last year
RobertCsordas / moeut
☆89Updated last year
facebookresearch / mexma
MEXMA: Token-level objectives improve sentence representations
☆42Updated 11 months ago
bminixhofer / zett
Code for Zero-Shot Tokenizer Transfer
☆142Updated 10 months ago
GenRobo / MatMamba
Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"
☆61Updated last year
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆19Updated this week
epfml / DenseFormer
☆82Updated last year
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78Updated last year
bminixhofer / tokenkit
A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers.
☆55Updated 5 months ago
Aleph-Alpha-Research / trigrams
☆58Updated 2 weeks ago
LAGoM-NLP / transtokenizer
☆55Updated 10 months ago
TristanThrush / i-am-a-strange-dataset
Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models"
☆45Updated last year
dmis-lab / Monet
[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
☆74Updated 5 months ago
r-three / phatgoose
Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"
☆91Updated last year
SeunghyunSEO / optimized_hf_llama_class_for_training
☆48Updated last year
Qichuzyy / POA
Official implementation of ECCV24 paper: POA
☆24Updated last year
trapoom555 / Language-Model-STS-CFT
Improving Text Embedding of Language Models Using Contrastive Fine-tuning
☆66Updated last year
ltgoslo / gpt-bert
Official implementation of "GPT or BERT: why not both?"
☆63Updated 4 months ago
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆45Updated last year
ahstat / episodic-memory-benchmark
Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode…
☆60Updated 2 months ago
kjslag / spacebyte
A byte-level decoder architecture that matches the performance of tokenized Transformers.
☆66Updated last year
ltgoslo / bert-in-context
Official implementation of "BERTs are Generative In-Context Learners"
☆32Updated 8 months ago
sanyalsunny111 / LLM-Inheritune
This is the official repository for Inheritune.
☆115Updated 9 months ago
catie-aq / flashT5
A fast implementation of T5/UL2 in PyTorch using Flash Attention
☆112Updated last month
allenai / infinigram-api
☆87Updated this week
microsoft / encoder-decoder-slm
Efficient encoder-decoder architecture for small language models (≤1B parameters) with cross-architecture knowledge distillation and visi…
☆33Updated 10 months ago
g-luo / vlm_cross_modal_reps
Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025
☆31Updated 7 months ago
devvrit / matformer
MatFormer repo
☆66Updated 11 months ago
Knowledgator / TurboT5
Truly flash T5 realization!
☆71Updated last year
KaiNylund / lm-weights-encode-time
☆69Updated last year