sail-sg / LightTrans
The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation"
☆19 Updated 2 weeks ago
Alternatives and similar repositories for LightTrans
Users who are interested in LightTrans are comparing it to the repositories listed below.
- ☆15 Updated 3 weeks ago
- ☆39 Updated last month
- ☆78 Updated 3 weeks ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆67 Updated 2 months ago
- The official implementation of the paper "SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction" ☆45 Updated 6 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆77 Updated 6 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models (https://arxiv.org/pdf/2411.02433) ☆25 Updated 5 months ago
- The official repository of "Unnatural Languages Are Not Bugs but Features for LLMs" ☆17 Updated 2 months ago
- Codebase for decoding compressed trust. ☆23 Updated last year
- ☆82 Updated this week
- LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification ☆52 Updated 2 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆44 Updated 6 months ago
- ☆20 Updated 2 months ago
- ☆99 Updated last week
- ☆22 Updated 10 months ago
- ☆16 Updated last month
- ☆18 Updated 5 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆26 Updated 2 months ago
- PyTorch implementation of StableMask (ICML'24) ☆12 Updated 10 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆72 Updated 6 months ago
- Code for the paper "A Sober Look at Progress in Language Model Reasoning" ☆45 Updated this week
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" (EMNLP 2023) ☆37 Updated last year
- ☆15 Updated 6 months ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆138 Updated last month
- ☆17 Updated 4 months ago
- The rule-based evaluation subset and code implementation of Omni-MATH ☆21 Updated 4 months ago
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions" ☆36 Updated 11 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆21 Updated 2 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆34 Updated 3 weeks ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆37 Updated 10 months ago