The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation"
☆22 · Apr 22, 2025 · Updated 10 months ago
Alternatives and similar repositories for LightTrans
Users who are interested in LightTrans are comparing it to the libraries listed below.
- ☆20 · Apr 16, 2025 · Updated 10 months ago
- The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction. ☆51 · Oct 18, 2024 · Updated last year
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ☆74 · Jul 14, 2025 · Updated 7 months ago
- [ArXiv 2025] Denial-of-Service Poisoning Attacks on Large Language Models ☆23 · Oct 22, 2024 · Updated last year
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" ☆59 · Jan 5, 2026 · Updated last month
- triton ver of gqa flash attn, based on the tutorial ☆12 · Aug 4, 2024 · Updated last year
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆47 · Apr 15, 2025 · Updated 10 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 · Dec 29, 2025 · Updated 2 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆134 · Mar 21, 2025 · Updated 11 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆40 · Feb 13, 2025 · Updated last year
- The code for the paper "Efficient Self-Supervised Video Hashing with Selective State Spaces" (AAAI'25). ☆22 · Aug 2, 2025 · Updated 7 months ago
- The official repository of 'Unnatural Language Are Not Bugs but Features for LLMs' ☆24 · May 20, 2025 · Updated 9 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆51 · Jul 15, 2025 · Updated 7 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆84 · Oct 23, 2024 · Updated last year
- ☆20 · Dec 24, 2024 · Updated last year
- Code of the paper: Finetuning Text-to-Image Diffusion Models for Fairness ☆45 · Apr 26, 2024 · Updated last year
- ☆33 · Apr 22, 2025 · Updated 10 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆12 · Jun 28, 2025 · Updated 8 months ago
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Updated this week
- [TMLR 2025] On Memorization in Diffusion Models ☆31 · Oct 5, 2023 · Updated 2 years ago
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies ☆59 · Feb 6, 2026 · Updated 3 weeks ago
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024) ☆37 · Jan 23, 2024 · Updated 2 years ago
- ☆18 · Jun 10, 2025 · Updated 8 months ago
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios ☆16 · Oct 18, 2024 · Updated last year
- xKV: Cross-Layer SVD for KV-Cache Compression ☆44 · Nov 30, 2025 · Updated 3 months ago
- The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign) ☆80 · Apr 23, 2025 · Updated 10 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 · Apr 14, 2025 · Updated 10 months ago
- Graph Diffusion Policy Optimization ☆42 · Mar 17, 2024 · Updated last year
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs ☆30 · Updated this week
- A Framework for Evaluating AI Agent Safety in Realistic Environments ☆30 · Oct 2, 2025 · Updated 5 months ago
- ☆11 · Jun 22, 2025 · Updated 8 months ago
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi… ☆30 · Oct 30, 2025 · Updated 4 months ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents" ☆29 · Oct 23, 2025 · Updated 4 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 · Jul 19, 2024 · Updated last year
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆89 · Sep 26, 2024 · Updated last year
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model ☆37 · Nov 27, 2024 · Updated last year
- [ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling ☆51 · Jul 15, 2025 · Updated 7 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆105 · Sep 18, 2025 · Updated 5 months ago
- [CVPR 2026] Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning" ☆82 · Feb 13, 2026 · Updated 2 weeks ago