☆259 · Jun 6, 2025 · Updated 9 months ago
Alternatives and similar repositories for meliad
Users who are interested in meliad are comparing it to the libraries listed below:
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit ☆63 · Jun 21, 2023 · Updated 2 years ago
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 · Jan 23, 2024 · Updated 2 years ago
- Sequence modeling with Mega. ☆303 · Jan 28, 2023 · Updated 3 years ago
- ☆52 · Jan 19, 2023 · Updated 3 years ago
- ☆23 · Oct 15, 2022 · Updated 3 years ago
- Convolutions for Sequence Modeling ☆913 · Jun 13, 2024 · Updated last year
- ☆13 · Aug 23, 2024 · Updated last year
- Convenient Text-to-Text Training for Transformers ☆19 · Dec 10, 2021 · Updated 4 years ago
- An annotated implementation of the Hyena Hierarchy paper ☆34 · May 28, 2023 · Updated 2 years ago
- ☆20 · May 30, 2024 · Updated last year
- Understand and test language model architectures on synthetic tasks. ☆257 · Feb 24, 2026 · Updated last week
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- ☆19 · Dec 4, 2025 · Updated 3 months ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023] ☆14 · Jul 11, 2023 · Updated 2 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 · Aug 12, 2023 · Updated 2 years ago
- ☆10 · Dec 17, 2020 · Updated 5 years ago
- The official Languini Kitchen repository ☆14 · May 6, 2024 · Updated last year
- ☆35 · Apr 12, 2024 · Updated last year
- Large Context Attention ☆769 · Oct 13, 2025 · Updated 4 months ago
- Official code for Long Expressive Memory (ICLR 2022, Spotlight) ☆71 · Mar 11, 2022 · Updated 3 years ago
- Scripts for downloading and pre-processing the `proof-pile`, a high quality dataset of mathematical text and code. ☆22 · Nov 26, 2022 · Updated 3 years ago
- Performant, composable online learning ☆16 · Feb 22, 2021 · Updated 5 years ago
- ☆13 · Jun 16, 2021 · Updated 4 years ago
- Rust bindings for CTranslate2 ☆14 · Jun 21, 2023 · Updated 2 years ago
- Hrrformer: A Neuro-symbolic Self-attention Model (ICML23) ☆62 · Oct 8, 2025 · Updated 4 months ago
- JAX/Flax implementation of the Hyena Hierarchy ☆34 · Apr 27, 2023 · Updated 2 years ago
- Experiments for efforts to train a new and improved t5 ☆76 · Apr 15, 2024 · Updated last year
- Sequence Modeling with Structured State Spaces ☆67 · Aug 2, 2022 · Updated 3 years ago
- [ACL'20] Highway Transformer: A Gated Transformer. ☆33 · Dec 5, 2021 · Updated 4 years ago
- Task-based datasets, preprocessing, and evaluation for sequence models. ☆594 · Feb 3, 2026 · Updated last month
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024) ☆15 · Jan 7, 2025 · Updated last year
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Oct 9, 2022 · Updated 3 years ago
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion our EMNLP 2023 paper - Accelerating Toeplitz… ☆14 · Oct 17, 2023 · Updated 2 years ago
- ☆13 · Feb 7, 2023 · Updated 3 years ago
- Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis. ☆147 · Jul 26, 2021 · Updated 4 years ago
- ☆78 · Dec 7, 2023 · Updated 2 years ago
- ☆77 · Apr 29, 2024 · Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆695 · Jan 26, 2026 · Updated last month
- Open weights language model from Google DeepMind, based on Griffin. ☆663 · Feb 6, 2026 · Updated last month