aju22 / LLaMA2
This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT) variant. The implementation focuses on the model architecture and the inference process. The code is restructured and heavily commented to make the key parts of the architecture easy to follow.
☆74 · Updated 2 years ago
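As an illustration of the kind of component such a from-scratch implementation typically walks through, here is a minimal sketch of RMSNorm, the normalization layer LLaMA 2 uses in place of LayerNorm. This is a generic sketch for orientation, not code taken from the repository.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square layer norm, used by LLaMA-family models instead of LayerNorm."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps                                # guards against division by zero
        self.weight = nn.Parameter(torch.ones(dim))   # learned per-feature scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale each token vector by the reciprocal of its RMS over the feature
        # dimension, then apply the learned gain. Unlike LayerNorm, no mean is
        # subtracted and no bias is added.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

# Usage: normalize a batch of token embeddings.
x = torch.randn(2, 8, 512)      # (batch, seq_len, model_dim)
print(RMSNorm(512)(x).shape)    # torch.Size([2, 8, 512])
```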
Alternatives and similar repositories for LLaMA2
Users interested in LLaMA2 are comparing it to the libraries listed below.
- A family of compressed models obtained via pruning and knowledge distillation ☆361 · Updated last month
- Official PyTorch implementation of QA-LoRA ☆145 · Updated last year
- An extension of the nanoGPT repository for training small MoE models ☆218 · Updated 9 months ago
- Implementation of speculative sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind ☆107 · Updated last year
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed" ☆185 · Updated last month
- Training code for Baby-Llama, our submission to the strict-small track of the BabyLM challenge ☆85 · Updated 2 years ago
- LLaMA 2 implemented from scratch in PyTorch ☆363 · Updated 2 years ago
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) ☆244 · Updated 9 months ago
- Unofficial implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆176 · Updated last year
- ☆235 · Updated last year
- LoRA and DoRA from-scratch implementations ☆215 · Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding" (ACL 2024) ☆349 · Updated 7 months ago
- Explorations into some recent techniques surrounding speculative decoding ☆295 · Updated 11 months ago
- Code for studying the super weight in LLMs ☆121 · Updated last year
- ☆204 · Updated last year
- Prune transformer layers ☆74 · Updated last year
- Code releases for compression methods for transformers, accompanying our publications ☆452 · Updated 11 months ago
- Low-bit optimizers for PyTorch