alperiox / Compact-Language-Models-via-Pruning-and-Knowledge-Distillation
Unofficial implementation of https://arxiv.org/pdf/2407.14679
☆36 Updated 2 months ago
Related projects
Alternatives and complementary repositories for Compact-Language-Models-via-Pruning-and-Knowledge-Distillation
- ☆40 Updated 2 weeks ago
- Prune transformer layers ☆64 Updated 5 months ago
- An unofficial implementation of the SOLAR-10.7B model, plus an implementation of the newly proposed interlocked-DUS (iDUS) and experiment details. ☆13 Updated 8 months ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆59 Updated 3 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆53 Updated 2 months ago
- ☆63 Updated last month
- ☆122 Updated 9 months ago
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated 9 months ago
- ☆87 Updated 9 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆72 Updated 2 months ago
- ☆46 Updated 2 weeks ago
- ☆63 Updated last month
- Set of scripts to finetune LLMs ☆36 Updated 7 months ago
- Efficient Infinite Context Transformers with Infini-attention PyTorch Implementation + QwenMoE Implementation + Training Script + 1M cont… ☆65 Updated 6 months ago
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆52 Updated last week
- Supercharge huggingface transformers with model parallelism. ☆75 Updated last month
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆29 Updated 6 months ago
- The official repo for "LLoCo: Learning Long Contexts Offline" ☆113 Updated 5 months ago
- ☆62 Updated 3 months ago
- ☆59 Updated last month
- Official Implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks ☆31 Updated 4 months ago
- This is the official repository for Inheritune. ☆105 Updated last month
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆43 Updated 4 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆19 Updated 2 months ago
- ☆45 Updated 2 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆103 Updated 6 months ago
- The official implementation of the paper "Demystifying the Compression of Mixture-of-Experts Through a Unified Framework". ☆48 Updated 3 weeks ago
- ☆49 Updated last month
- ☆29 Updated 4 months ago
- A repository for research on medium sized language models. ☆74 Updated 5 months ago