aju22 / LLaMA2

This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT) variant. The implementation focuses on the model architecture and the inference process. The code is restructured and heavily commented to facilitate easy understanding of the key parts of the architecture.

☆72

Alternatives and similar repositories for LLaMA2

Users who are interested in LLaMA2 are comparing it to the libraries listed below.


yuhuixu1993 / qa-lora
Official PyTorch implementation of QA-LoRA
☆141 Updated last year
shreyansh26 / Speculative-Sampling
Implementation of Speculative Sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by Deepmind
☆104 Updated last year
yxli2123 / LoftQ
☆230 Updated last year
NVlabs / Minitron
A family of compressed models obtained via pruning and knowledge distillation
☆352 Updated 11 months ago
wolfecameron / nanoMoE
An extension of the nanoGPT repository for training small MOE models.
☆197 Updated 7 months ago
astramind-ai / Mixture-of-depths
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
☆173 Updated last year
CASE-Lab-UMD / LLM-Drop
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
☆177 Updated 6 months ago
jshuadvd / LongRoPE
Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper
☆150 Updated last year
lucidrains / speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
☆288 Updated 9 months ago
pratyushasharma / laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
☆388 Updated last year
arcee-ai / PruneMe
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
☆249 Updated last year
thu-ml / low-bit-optimizers
Low-bit optimizers for PyTorch
☆131 Updated 2 years ago
timinar / BabyLlama
Training code for Baby-Llama, our submission to the strict-small track of the BabyLM challenge.
☆84 Updated 2 years ago
mengxiayu / LLMSuperWeight
Code for studying the super weight in LLM
☆120 Updated 10 months ago
alperiox / Compact-Language-Models-via-Pruning-and-Knowledge-Distillation
Unofficial implementation of https://arxiv.org/pdf/2407.14679
☆49 Updated last year
itsnamgyu / block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
☆162 Updated 6 months ago
dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024)
☆205 Updated last year
jongwooko / distillm
Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)
☆233 Updated 7 months ago
microsoft / TransformerCompression
For releasing code related to compression methods for transformers, accompanying our publications
☆446 Updated 9 months ago
FasterDecoding / BitDelta
☆201 Updated 10 months ago
hkproj / pytorch-llama
LLaMA 2 implemented from scratch in PyTorch
☆355 Updated 2 years ago
NetEase-FuXi / EETQ
Easy and Efficient Quantization for Transformers
☆203 Updated 3 months ago
keeeeenw / MicroLlama
Micro Llama is a small Llama based model with 300M parameters trained from scratch with $500 budget
☆161 Updated 2 months ago
HanGuo97 / lq-lora
☆127 Updated last year
HKUNLP / ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
☆440 Updated last year
jaymody / speculative-sampling
Simple implementation of Speculative Sampling in NumPy for GPT-2.
☆96 Updated 2 years ago
rasbt / dora-from-scratch
LoRA and DoRA from Scratch Implementations
☆211 Updated last year
VITA-Group / Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
☆201 Updated last year
catid / dora
Implementation of DoRA
☆302 Updated last year
dust-tt / llama-ssp
Experiments on speculative sampling with Llama models
☆126 Updated 2 years ago