hkproj / pytorch-transformer-distributed
Distributed training (multi-node) of a Transformer model
☆59 · Updated 11 months ago
Alternatives and similar repositories for pytorch-transformer-distributed:
Users interested in pytorch-transformer-distributed are comparing it to the repositories listed below.
- Prune transformer layers ☆68 · Updated 9 months ago
- ☆136 · Updated 2 months ago
- Notes on Direct Preference Optimization ☆18 · Updated 11 months ago
- ☆158 · Updated last month
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆47 · Updated 10 months ago
- ☆41 · Updated 11 months ago
- Notes on quantization in neural networks ☆77 · Updated last year
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed" ☆164 · Updated 3 months ago
- Notes and commented code for RLHF (PPO) ☆77 · Updated last year
- LoRA and DoRA from-scratch implementations ☆198 · Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆277 · Updated 3 weeks ago
- Unofficial implementation of https://arxiv.org/pdf/2407.14679 ☆44 · Updated 6 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 · Updated 11 months ago
- From-scratch implementation of a vision language model in pure PyTorch ☆205 · Updated 10 months ago
- LoRA: Low-Rank Adaptation of Large Language Models, implemented using PyTorch ☆99 · Updated last year
- ☆173 · Updated 3 months ago
- LLaMA 2 implemented from scratch in PyTorch ☆307 · Updated last year
- ☆151 · Updated last year
- ☆82 · Updated 5 months ago
- Unofficial implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆154 · Updated 9 months ago
- Train, tune, and infer the Bamba model ☆86 · Updated 2 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆150 · Updated 3 months ago
- Collection of autoregressive model implementations ☆83 · Updated last month
- Code for studying the super weight in LLMs ☆91 · Updated 3 months ago