State-of-the-art paired encoder and decoder models (17M-1B params)
☆64 · Aug 6, 2025 · Updated 7 months ago
Alternatives and similar repositories for ettin-encoder-vs-decoder
Users who are interested in ettin-encoder-vs-decoder are comparing it to the libraries listed below.
- This repository helps you evaluate your models on the FreshStack benchmark! ☆33 · Dec 9, 2025 · Updated 3 months ago
- ☆95 · Jul 4, 2025 · Updated 8 months ago
- Documenting large text datasets 🖼️ 📚 ☆14 · Dec 17, 2024 · Updated last year
- Official Repository for "Hypencoder: Hypernetworks for Information Retrieval" ☆35 · Sep 20, 2025 · Updated 6 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆28 · Nov 30, 2024 · Updated last year
- Experiments for efforts to train a new and improved t5 ☆76 · Apr 15, 2024 · Updated last year
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference ☆36 · Oct 29, 2025 · Updated 4 months ago
- One-stop shop for running and fine-tuning transformer-based language models for retrieval ☆63 · Mar 14, 2026 · Updated last week
- Tool to perform paired evaluation of automatic systems ☆13 · Oct 20, 2021 · Updated 4 years ago
- The training code of Jasper-Token-Compression-600M ☆19 · Nov 19, 2025 · Updated 4 months ago
- ACL22 paper: Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost ☆42 · Nov 15, 2023 · Updated 2 years ago
- Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking ☆13 · Feb 5, 2023 · Updated 3 years ago
- ☆108 · Jun 2, 2025 · Updated 9 months ago
- ☆13 · Oct 2, 2023 · Updated 2 years ago
- Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP … ☆17 · Oct 17, 2023 · Updated 2 years ago
- Contextualized per-token embeddings ☆34 · May 11, 2025 · Updated 10 months ago
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆32 · Jun 23, 2025 · Updated 9 months ago
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi… ☆14 · Jun 6, 2025 · Updated 9 months ago
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL) ☆16 · Jul 27, 2024 · Updated last year
- Repository for the code of the "PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided Decoding" paper, NAACL'22 ☆66 · Oct 25, 2022 · Updated 3 years ago
- [TMLR 2025 & ICLR 2025 DeLTa] Official Implementation of Design Editing for Offline Model-based Optimization 🧬 🤖 ☆10 · Apr 17, 2025 · Updated 11 months ago
- A benchmark to evaluate situated inductive reasoning ☆15 · Jan 7, 2025 · Updated last year
- ☆16 · Jun 14, 2024 · Updated last year
- Minimalist implementation of a GPT2 with Language Model Head with PyTorch Lightning, Transformers and PyTorch-NLP. ☆24 · Jun 12, 2023 · Updated 2 years ago
- ☆15 · Jun 19, 2025 · Updated 9 months ago
- BPE modification that removes intermediate tokens during tokenizer training. ☆27 · Nov 25, 2024 · Updated last year
- Official implementation of NeurIPS'24 Spotlight paper "Monte Carlo Tree Search based Space Transfer for Black-box Optimization". ☆13 · Nov 28, 2024 · Updated last year
- Natural Perturbation for Robust Question Answering ☆12 · Apr 7, 2020 · Updated 5 years ago
- Code and datasets for EMNLP 2022 paper: Beyond prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Repr… ☆19 · Jan 1, 2024 · Updated 2 years ago
- ☆30 · Dec 23, 2025 · Updated 3 months ago
- Scaling Laws for Mixture of Experts Models ☆15 · Feb 25, 2025 · Updated last year
- ☆114 · Jun 9, 2022 · Updated 3 years ago
- ☆15 · Dec 15, 2025 · Updated 3 months ago
- ☆14 · Nov 2, 2022 · Updated 3 years ago
- Interactive documentation and programming with Scala, iPython notebook style. ☆19 · Mar 9, 2016 · Updated 10 years ago
- NLP with Rust for Python 🦀🐍 ☆72 · May 13, 2025 · Updated 10 months ago
- ALBERT Persian Playground ☆13 · Jun 12, 2023 · Updated 2 years ago
- Collection of LLM completions for reasoning-gym task datasets ☆31 · Jul 4, 2025 · Updated 8 months ago
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers. ☆64 · Jul 6, 2025 · Updated 8 months ago