hkproj / pytorch-llama
LLaMA 2 implemented from scratch in PyTorch
☆216 Updated 11 months ago
Related projects:
- Scalable toolkit for efficient model alignment ☆509 Updated this week
- LoRA and DoRA from Scratch Implementations ☆179 Updated 6 months ago
- A family of compressed models obtained via pruning and knowledge distillation ☆241 Updated 3 weeks ago
- Official PyTorch implementation of QA-LoRA ☆111 Updated 6 months ago
- Minimalistic large language model 3D-parallelism training ☆1,116 Updated this week
- distributed trainer for LLMs ☆524 Updated 4 months ago
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆539 Updated 6 months ago
- ☆629 Updated last week
- LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch ☆72 Updated last year
- LLM Workshop by Sourab Mangrulkar ☆322 Updated 3 months ago
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… ☆47 Updated 11 months ago
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. ☆608 Updated last month
- Official repository for ORPO ☆409 Updated 3 months ago
- Llama from scratch, or How to implement a paper without crying ☆499 Updated 3 months ago
- Transformers with Arbitrarily Large Context ☆613 Updated last month
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ☆452 Updated last month
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning ☆309 Updated 2 weeks ago
- For releasing code related to compression methods for transformers, accompanying our publications ☆356 Updated 2 weeks ago
- PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention… ☆271 Updated 4 months ago
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM ☆416 Updated this week
- Reference implementation of Mistral AI 7B v0.1 model. ☆26 Updated 8 months ago
- Training and Fine-tuning an llm in Python and PyTorch. ☆38 Updated last year
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆530 Updated 4 months ago
- [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization ☆629 Updated last month
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context ☆416 Updated 6 months ago
- An Open Source Toolkit For LLM Distillation ☆284 Updated last month
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆248 Updated 10 months ago
- Official Implementation of EAGLE-1 and EAGLE-2 ☆749 Updated 3 weeks ago
- The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction ☆361 Updated 2 months ago
- A bagel, with everything. ☆306 Updated 5 months ago