Aaquib111 / Sparse-GPT-Finetuning
Code for my ICLR 2024 TinyPapers paper "Prune and Tune: Improving Efficient Pruning Techniques for Massive Language Models"
☆13 · Updated last year
Related projects
Alternatives and complementary repositories for Sparse-GPT-Finetuning
- A testbed for various linear attention designs. ☆56 · Updated 6 months ago
- Implementation of the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google, in PyTorch. ☆52 · Updated last week
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆43 · Updated 4 months ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models". ☆42 · Updated last week
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning (COLM 2024). ☆28 · Updated 5 months ago
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes". ☆28 · Updated 7 months ago
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens. ☆21 · Updated last year
- ☆50 · Updated last month
- An experiment on Dynamic NTK Scaling RoPE. ☆61 · Updated 11 months ago
- A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks. ☆28 · Updated 4 months ago
- Code for the paper "Patch-Level Training for Large Language Models". ☆71 · Updated last week
- Script for processing OpenAI's PRM800K process-supervision dataset into an Alpaca-style instruction–response format. ☆27 · Updated last year
- ☆18 · Updated 3 months ago
- ☆64 · Updated last month
- Easy control for Key-Value Constrained Generative LLM Inference (https://arxiv.org/abs/2402.06262). ☆58 · Updated 9 months ago
- A repository based on https://github.com/jiaweizzhao/GaLore. ☆19 · Updated 2 months ago
- A Closer Look into Mixture-of-Experts in Large Language Models. ☆40 · Updated 3 months ago
- Codebase for "Instruction Following without Instruction Tuning". ☆32 · Updated last month
- Repository for sparse finetuning of LLMs via a modified version of the MosaicML llmfoundry. ☆38 · Updated 10 months ago
- FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation. ☆46 · Updated 4 months ago
- Code for the paper "Long cOntext aliGnment via efficient preference Optimization". ☆12 · Updated 3 weeks ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection. ☆27 · Updated 3 weeks ago
- Implementation of the model "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch. ☆29 · Updated last week
- ☆58 · Updated 5 months ago
- ☆35 · Updated 9 months ago
- ☆45 · Updated 4 months ago
- Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs. ☆22 · Updated last month
- EE-LLM: a framework for large-scale training and inference of early-exit (EE) large language models (LLMs). ☆49 · Updated 5 months ago
- ☆30 · Updated this week
- Repository for CPU kernel generation for LLM inference. ☆25 · Updated last year