imbue-ai / carbs
Cost aware hyperparameter tuning algorithm
☆119 · Updated 4 months ago
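The header describes carbs as a cost-aware hyperparameter tuner: trials are chosen with their training cost (e.g. wall-clock time or compute) in mind, not just their final metric. The sketch below is only a minimal illustration of that idea, not carbs' actual API; the search space, the `run_trial` stand-in, and the cost budget are all hypothetical, and carbs itself implements a more sophisticated Bayesian approach to the cost/performance trade-off than the plain budgeted random search shown here.

```python
import math
import random

# Hypothetical search space, for illustration only.
LR_RANGE = (1e-4, 1e-1)
BATCH_SIZES = [32, 64, 128, 256]


def sample_config():
    """Draw one random configuration from the hypothetical search space."""
    log_lr = random.uniform(math.log10(LR_RANGE[0]), math.log10(LR_RANGE[1]))
    return {"lr": 10 ** log_lr, "batch_size": random.choice(BATCH_SIZES)}


def run_trial(config):
    """Stand-in for a real training run.

    Returns (score, cost): in practice, score would be a validation metric
    and cost the wall-clock time or compute the trial consumed.
    """
    score = random.random()
    cost = config["batch_size"] / 32 * random.uniform(1.0, 2.0)
    return score, cost


def budgeted_random_search(cost_budget=20.0):
    """Cost-aware in the simplest sense: track cumulative trial cost and
    stop once the budget is spent, keeping the best-scoring config seen."""
    best_score, best_config, spent = float("-inf"), None, 0.0
    while spent < cost_budget:
        config = sample_config()
        score, cost = run_trial(config)
        spent += cost
        if score > best_score:
            best_score, best_config = score, config
    return best_config, best_score, spent


if __name__ == "__main__":
    config, score, spent = budgeted_random_search()
    print(f"best config: {config}, score: {score:.3f}, total cost: {spent:.1f}")
```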
Related projects
Alternatives and complementary repositories for carbs
- seqax = sequence modeling + JAX (☆132, updated 3 months ago)
- The simplest, fastest repository for training/finetuning medium-sized GPTs. (☆83, updated last week)
- Efficient baselines for autocurricula in JAX. (☆172, updated 2 months ago)
- Minimal (400 LOC) implementation, maximum (multi-node, FSDP) GPT training (☆112, updated 6 months ago)
- Scalable neural net training via automatic normalization in the modular norm. (☆118, updated 2 months ago)
- A set of Python scripts that make your experience on TPU better (☆40, updated 4 months ago)
- Efficient World Models with Context-Aware Tokenization. ICML 2024 (☆84, updated last month)
- A simple library for scaling up JAX programs (☆125, updated last week)
- A MAD laboratory to improve AI architecture designs 🧪 (☆95, updated 6 months ago)
- Minimal but scalable implementation of large language models in JAX (☆25, updated last week)
- fast + parallel AlphaZero in JAX (☆84, updated 7 months ago)
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… (☆46, updated last week)
- Simple single-file baselines for Q-Learning in a pure-GPU setting (☆93, updated 3 months ago)
- Understand and test language model architectures on synthetic tasks. (☆161, updated 6 months ago)
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax (☆516, updated this week)
- Solve puzzles. Learn CUDA. (☆60, updated 10 months ago)
- LeanRL is a fork of CleanRL, where selected PyTorch scripts are optimized for performance using compile and cudagraphs. (☆429, updated 2 weeks ago)
- LoRA for arbitrary JAX models and functions (☆131, updated 8 months ago)
- Inference code for LLaMA models in JAX (☆112, updated 5 months ago)
- Simple Transformer in Jax (☆115, updated 4 months ago)
- σ-GPT: A New Approach to Autoregressive Models (☆59, updated 2 months ago)
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) (☆176, updated 5 months ago)
- JAX implementation of the Llama 2 model (☆210, updated 9 months ago)
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. (☆29, updated last week)