ayaka14732 / llama-jax
JAX implementation of LLaMA, aiming to train LLaMA on Google Cloud TPU
☆14 · Updated 2 years ago
Alternatives and similar repositories for llama-jax
Users interested in llama-jax are comparing it to the libraries listed below.
- ☆13 · Updated 2 years ago
- JAX implementation of the Llama 2 model ☆216 · Updated last year
- ☆71 · Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 · Updated 2 years ago
- ☆20 · Updated 2 years ago
- train with kittens! ☆63 · Updated last year
- Inference code for LLaMA models ☆41 · Updated 2 years ago
- Inference code for LLaMA models in JAX ☆120 · Updated last year
- GPU benchmark ☆74 · Updated last year
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆131 · Updated last year
- Machine Learning eXperiment Utilities ☆48 · Updated 6 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated last year
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models" ☆280 · Updated 2 years ago
- Plug-and-play PyTorch implementation of the paper "Evolutionary Optimization of Model Merging Recipes" by Sakana AI ☆31 · Updated last year
- RWKV-7: Surpassing GPT ☆104 · Updated last year
- Advanced ultra-low-bitrate compression techniques for the LLaMA family of LLMs ☆110 · Updated 2 years ago
- Collection of autoregressive model implementations ☆85 · Updated 2 weeks ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 · Updated last year
- E2E AutoML model compression package ☆45 · Updated 10 months ago
- Data preparation code for the Amber 7B LLM ☆94 · Updated last year
- QuIP quantization ☆61 · Updated last year
- The Next Generation Multi-Modality Superintelligence ☆70 · Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆66 · Updated 2 months ago
- LLM training in simple, raw C/CUDA ☆18 · Updated last year
- Train, tune, and infer the Bamba model ☆138 · Updated 7 months ago
- ☆112 · Updated 2 years ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers ☆18 · Updated 6 months ago
- A really tiny autograd engine ☆99 · Updated 8 months ago
- Evaluating the Mamba architecture on the Othello game ☆49 · Updated last year
- JAX implementation of the Mistral 7b v0.2 model ☆35 · Updated last year