warner-benjamin / optimi
Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers
☆59 · Updated 4 months ago
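optimi's low-precision support is built around Kahan summation, which compensates for the rounding error that builds up when many small optimizer updates are accumulated into low-precision weights. Below is a minimal pure-Python sketch of the technique itself, not optimi's actual implementation; the `kahan_add` helper name is illustrative:

```python
def kahan_add(total, compensation, value):
    """Add value to total, tracking bits lost to rounding in compensation."""
    y = value - compensation        # re-apply previously lost low-order bits
    t = total + y                   # low-order bits of y may round away here
    compensation = (t - total) - y  # recover exactly what was just lost
    return t, compensation

# Accumulate one million tiny updates: plain addition vs. compensated.
total, comp = 0.0, 0.0
naive = 0.0
for _ in range(1_000_000):
    total, comp = kahan_add(total, comp, 1e-6)
    naive += 1e-6
# The compensated sum lands essentially on 1.0; the naive sum drifts slightly.
```

The same compensation buffer idea lets a bfloat16 weight update behave much closer to a full-precision one, at the cost of one extra buffer per parameter.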
Related projects
Alternatives and complementary repositories for optimi
- ☆20 · Updated last year
- ☆49 · Updated 8 months ago
- Exploring finetuning public checkpoints on filtered 8K sequences on Pile ☆115 · Updated last year
- Experiments with generating open-source language model assistants ☆97 · Updated last year
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆71 · Updated last month
- ☆73 · Updated 4 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆84 · Updated this week
- ☆77 · Updated 5 months ago
- Various transformers for FSDP research ☆33 · Updated 2 years ago
- A library for squeakily cleaning and filtering language datasets. ☆45 · Updated last year
- Experiments for efforts to train a new and improved T5 ☆76 · Updated 7 months ago
- ☆77 · Updated 7 months ago
- A byte-level decoder architecture that matches the performance of tokenized Transformers. ☆59 · Updated 6 months ago
- ☆45 · Updated 2 months ago
- Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead ☆121 · Updated this week
- HomebrewNLP in JAX flavour for maintainable TPU training ☆46 · Updated 10 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 · Updated last year
- Code for training & evaluating Contextual Document Embedding models ☆119 · Updated this week
- Official implementation of "GPT or BERT: why not both?" ☆37 · Updated last week
- Understand and test language model architectures on synthetic tasks.