[TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models
☆125 · Updated Feb 15, 2026
Alternatives and similar repositories for LLM-Inheritune
Users interested in LLM-Inheritune are comparing it to the libraries listed below.
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆203 · Updated Jul 17, 2024
- ☆109 · Updated Jul 15, 2025
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 · Updated Feb 29, 2024
- FuseAI Project ☆590 · Updated Jan 25, 2025
- Official code for the paper "Attention as a Hypernetwork" ☆51 · Updated Feb 24, 2026
- AirLLM 70B inference with single 4GB GPU ☆17 · Updated Jun 27, 2025
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆92 · Updated Oct 30, 2024
- MiSS is a novel PEFT method that features a low-rank structure but introduces a new update mechanism distinct from LoRA, achieving an exc… ☆31 · Updated Jan 28, 2026
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆78 · Updated Mar 12, 2024
- [ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear at… ☆103 · Updated Jun 14, 2024
- Official repository of the paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 · Updated Apr 17, 2024
- Unofficial implementation of Evolutionary Model Merging ☆41 · Updated Mar 28, 2024
- Unit Scaling demo and experimentation code ☆16 · Updated Mar 12, 2024
- An automated data pipeline scaling RL to pretraining levels ☆72 · Updated Oct 11, 2025
- Code for studying the super weight in LLMs ☆122 · Updated Dec 3, 2024
- Minimal implementation of the "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" paper (arXiv:2401.01335) ☆29 · Updated Mar 1, 2024
- PB-LLM: Partially Binarized Large Language Models ☆156 · Updated Nov 20, 2023
- The implementation of CounterCurate, a data curation pipeline for both physical and semantic counterfactual image-caption pairs ☆19 · Updated Jun 27, 2024
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance… ☆156 · Updated Apr 7, 2025
- A full-stack on-prem deep research agent that can be run entirely on a local machine for $0! ☆31 · Updated Nov 8, 2025
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆32 · Updated Nov 4, 2024
- ☆20 · Updated May 30, 2024
- 🌟Official code of our AAAI26 paper 🔍WebFilter ☆37 · Updated Nov 9, 2025
- WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models ☆22 · Updated Jul 10, 2025
- [ICLR 2024] Repository for the paper "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ☆101 · Updated Apr 10, 2024
- Official PyTorch implementation of Self-emerging Token Labeling ☆35 · Updated Mar 27, 2024
- A family of compressed models obtained via pruning and knowledge distillation ☆368 · Updated Nov 6, 2025
- [ICML'24] Data and code for the paper "Training-Free Long-Context Scaling of Large Language Models" ☆446 · Updated Oct 16, 2024
- [NAACL 2024] Making Language Models Better Tool Learners with Execution Feedback ☆43 · Updated Mar 14, 2024
- Official repository for Task-Circuit Quantization ☆24 · Updated Jun 1, 2025
- ☆146 · Updated May 23, 2024
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement ☆193 · Updated Mar 25, 2024
- Model Stock: All we need is just a few fine-tuned models ☆129 · Updated Aug 9, 2025
- Official implementation of MAIA, a Multimodal Automated Interpretability Agent ☆103 · Updated Oct 22, 2025
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning ☆665 · Updated Jun 1, 2024
- Self-Supervised Alignment with Mutual Information ☆20 · Updated May 24, 2024
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ☆172 · Updated Nov 26, 2025
- Linear Attention Sequence Parallelism (LASP) ☆88 · Updated Jun 4, 2024
- ☆191 · Updated Sep 26, 2024