ayaka14732 / llama-jax
JAX implementation of LLaMA, aiming to train LLaMA on Google Cloud TPU
☆14 Updated last year
Related projects
Alternatives and complementary repositories for llama-jax
- JAX implementation of the Mistral 7b v0.2 model ☆33 Updated 4 months ago
- ☆20 Updated last year
- A set of Python scripts that makes your experience on TPU better ☆40 Updated 4 months ago
- Machine Learning eXperiment Utilities ☆45 Updated 5 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated last year
- Inference code for LLaMA models in JAX ☆113 Updated 6 months ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 Updated last year
- GPU benchmark ☆43 Updated last month
- RWKV model implementation ☆38 Updated last year
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆43 Updated 4 months ago
- ☆47 Updated 9 months ago
- Code repository for the c-BTM paper ☆105 Updated last year
- Einsum-like high-level array sharding API for JAX ☆32 Updated 4 months ago
- ☆50 Updated 6 months ago
- [WIP] Transformer to embed Danbooru labelsets ☆13 Updated 7 months ago
- JAX implementation of the Llama 2 model ☆210 Updated 9 months ago
- ☆25 Updated last year
- ☆46 Updated last week
- ☆13 Updated 4 months ago
- ☆55 Updated last month
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models ☆51 Updated 5 months ago
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).… ☆122 Updated last year
- ☆101 Updated 3 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆46 Updated 2 months ago
- ☆35 Updated 7 months ago
- A toolkit for scaling law research ⚖ ☆43 Updated 8 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆50 Updated 7 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆41 Updated 10 months ago
- Simple and efficient PyTorch-native transformer training and inference (batched) ☆61 Updated 7 months ago