Aaquib111 / Sparse-GPT-Finetuning
Code for my ICLR 2024 TinyPapers paper "Prune and Tune: Improving Efficient Pruning Techniques for Massive Language Models"
☆14 Updated last year
Alternatives and similar repositories for Sparse-GPT-Finetuning:
Users who are interested in Sparse-GPT-Finetuning are comparing it to the repositories listed below.
- Codebase for Instruction Following without Instruction Tuning ☆34 Updated 6 months ago
- Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24) ☆22 Updated 5 months ago
- A repository for research on medium sized language models. ☆76 Updated 10 months ago
- ☆55 Updated last month
- Code for preprint "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)" ☆36 Updated 2 weeks ago
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes" ☆27 Updated last year
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆73 Updated 10 months ago
- ☆15 Updated this week
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆42 Updated 5 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 Updated last year
- Here we will test various linear attention designs. ☆60 Updated 11 months ago
- ☆17 Updated 6 months ago
- ☆76 Updated 2 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆43 Updated 8 months ago
- Repository for the paper: 500xCompressor: Generalized Prompt Compression for Large Language Models ☆33 Updated 7 months ago
- Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs? ☆16 Updated last month
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆28 Updated 3 weeks ago
- The rule-based evaluation subset and code implementation of Omni-MATH ☆18 Updated 3 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆48 Updated 2 weeks ago
- QuIP quantization ☆52 Updated last year
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI ☆99 Updated last month
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆30 Updated this week
- Code for paper: Long cOntext aliGnment via efficient preference Optimization ☆13 Updated last month
- ☆42 Updated last month
- [NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding ☆16 Updated 6 months ago
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆54 Updated 2 weeks ago
- ☆44 Updated last month
- Long Context Extension and Generalization in LLMs ☆53 Updated 6 months ago
- Exploration of automated dataset selection approaches at large scales. ☆35 Updated last month
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆31 Updated last month