SusCom-Lab / ZSMerge
☆17 Updated 3 weeks ago
Alternatives and similar repositories for ZSMerge
Users who are interested in ZSMerge are comparing it to the repositories listed below
- FlexAttention w/ FlashAttention3 Support ☆26 Updated 8 months ago
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆32 Updated 10 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆28 Updated 9 months ago
- GoldFinch and other hybrid transformer components ☆45 Updated 11 months ago
- A repository for research on medium sized language models. ☆76 Updated last year
- DPO, but faster 🚀 ☆43 Updated 6 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆126 Updated 6 months ago
- RWKV-7: Surpassing GPT ☆91 Updated 7 months ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆23 Updated 7 months ago
- ☆47 Updated 2 weeks ago
- Make triton easier ☆46 Updated last year
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆33 Updated 3 months ago
- Latent Large Language Models ☆18 Updated 10 months ago
- Using FlexAttention to compute attention with different masking patterns ☆44 Updated 9 months ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆41 Updated last month
- JAX Scalify: end-to-end scaled arithmetics ☆16 Updated 7 months ago
- Linear Attention Sequence Parallelism (LASP) ☆84 Updated last year
- The evaluation framework for training-free sparse attention in LLMs ☆69 Updated last week
- Beyond KV Caching: Shared Attention for Efficient LLMs ☆19 Updated 11 months ago
- ☆56 Updated 3 months ago
- ☆50 Updated last year
- NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference ☆66 Updated 6 months ago
- ☆14 Updated last month
- Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning" ☆16 Updated last month
- The official code repo and data hub of top_nsigma sampling strategy for LLMs. ☆26 Updated 4 months ago
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated this week
- ☆35 Updated last year
- Official Implementation of APB (ACL 2025 main) ☆28 Updated 4 months ago
- Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More ☆31 Updated last month
- ☆48 Updated 4 months ago