ztjhz / t5-jax
JAX implementation of the T5 model: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
☆24 Updated 2 years ago
Alternatives and similar repositories for t5-jax
Users that are interested in t5-jax are comparing it to the libraries listed below
- Implementation of VQ-VAE with a GPT-style sampler in the JAX and Haiku ecosystem. ☆12 Updated 2 years ago
- Official code for the paper "Context-Aware Language Modeling for Goal-Oriented Dialogue Systems" ☆34 Updated 3 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆92 Updated 2 years ago
- Train very large language models in Jax. ☆210 Updated 2 years ago
- ☆31 Updated 3 years ago
- ☆63 Updated 3 years ago
- DiT (training + flow matching) in Jax ☆11 Updated last year
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆51 Updated 2 years ago
- PyTorch Package For Quasimetric Learning ☆45 Updated last year
- Machine Learning eXperiment Utilities ☆48 Updated 6 months ago
- Repo to reproduce the First-Explore paper results ☆39 Updated last year
- LoRA for arbitrary JAX models and functions ☆144 Updated last year
- INTeractive learning via REPresentatIon Discovery ☆36 Updated last year
- Official Implementation of NeurIPS'23 Paper "Cross-Episodic Curriculum for Transformer Agents" ☆31 Updated 2 years ago
- ☆19 Updated 2 years ago
- unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆83 Updated 3 years ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆132 Updated last year
- ☆31 Updated last week
- JAX Implementation of Black Forest Labs' Flux.1 family of models ☆40 Updated 2 months ago
- Intrinsic Motivation from Artificial Intelligence Feedback ☆135 Updated 2 years ago
- Implementation of GateLoop Transformer in Pytorch and Jax ☆92 Updated last year
- TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods. ☆21 Updated 4 months ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆49 Updated 4 years ago
- Pytorch Implementation of MuZero Unplugged for gym environment. This algorithm is capable of supporting a wide range of action and observ… ☆35 Updated 7 months ago
- Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning" ☆210 Updated 2 years ago
- A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks ☆36 Updated last year
- Gym environment for playing Wordle with RL agents ☆42 Updated 3 years ago
- Experiments on GPT-3's ability to fit numerical models in-context. ☆14 Updated 3 years ago
- ☆35 Updated 3 years ago
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols ☆16 Updated 4 years ago