hkproj / pytorch-llama-notes
Notes about LLaMA 2 model
☆47 Updated last year
Related projects
Alternatives and complementary repositories for pytorch-llama-notes
- LLaMA 2 implemented from scratch in PyTorch ☆258 Updated last year
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… ☆54 Updated last year
- Notes on the Mistral AI model ☆18 Updated 10 months ago
- A repository dedicated to evaluating the performance of quantized LLaMA3 using various quantization methods. ☆166 Updated 3 months ago
- ring-attention experiments ☆97 Updated last month
- Implementation of Speculative Sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind ☆82 Updated 8 months ago
- Distributed training (multi-node) of a Transformer model ☆43 Updated 7 months ago
- Notes on quantization in neural networks ☆58 Updated 11 months ago
- Awesome list for LLM quantization ☆127 Updated this week
- ☆40 Updated 7 months ago
- Reference implementation of Mistral AI 7B v0.1 model. ☆27 Updated 10 months ago
- Training and fine-tuning an LLM in Python and PyTorch. ☆41 Updated last year
- Official PyTorch implementation of QA-LoRA ☆117 Updated 8 months ago
- The official implementation of the paper <MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression> ☆101 Updated last week
- ☆134 Updated last year
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆135 Updated 5 months ago
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed". ☆143 Updated last week
- ☆21 Updated last year
- Unofficial implementation of https://arxiv.org/pdf/2407.14679 ☆36 Updated 2 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆252 Updated last year
- Awesome Mobile LLMs ☆87 Updated 2 weeks ago
- BERT explained from scratch ☆12 Updated last year
- Training code for Baby-Llama, our submission to the strict-small track of the BabyLM challenge. ☆68 Updated last year
- awesome llm plaza: daily tracking all sorts of awesome topics of llm, e.g. llm for coding, robotics, reasoning, multimod etc. ☆154 Updated this week
- LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch ☆82 Updated last year
- ☆199 Updated 5 months ago
- ☆111 Updated 8 months ago
- ☆47 Updated 2 months ago
- The official implementation of the paper "Demystifying the Compression of Mixture-of-Experts Through a Unified Framework". ☆48 Updated 3 weeks ago
- Notes and commented code for RLHF (PPO) ☆38 Updated 8 months ago