thu-nics / R2R
The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing"
☆24 Updated this week
Alternatives and similar repositories for R2R
Users that are interested in R2R are comparing it to the libraries listed below
- ☆74 Updated 2 weeks ago
- ☆36 Updated 2 weeks ago
- This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality" ☆47 Updated 2 months ago
- The official repo of continuous speculative decoding ☆26 Updated 2 months ago
- [ICLR 2025] Official PyTorch implementation of "Forgetting Transformer: Softmax Attention with a Forget Gate" ☆104 Updated 3 weeks ago
- A WebUI for Side-by-Side Comparison of Media (Images/Videos) Across Multiple Folders ☆24 Updated 3 months ago
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆21 Updated 5 months ago
- ☆31 Updated last month
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆31 Updated 3 months ago
- ☆111 Updated last week
- ☆47 Updated 2 months ago
- Triton implementation of bi-directional (non-causal) linear attention ☆48 Updated 4 months ago
- the official repo for "D-AR: Diffusion via Autoregressive Models" ☆30 Updated last week
- ☆13 Updated 2 months ago
- Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model. ☆47 Updated last week
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆75 Updated 5 months ago
- Fast-Slow Thinking for Large Vision-Language Model Reasoning ☆14 Updated last month
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆25 Updated last month
- [ICML 2025] Fourier Position Embedding: Enhancing Attention’s Periodic Extension for Length Generalization ☆70 Updated this week
- TerDiT: Ternary Diffusion Models with Transformers ☆71 Updated 11 months ago
- VeriThinker: Learning to Verify Makes Reasoning Model Efficient ☆38 Updated last week
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache… ☆72 Updated this week
- Dimple, the first Discrete Diffusion Multimodal Large Language Model ☆60 Updated last week
- Autoregressive Image Generation with Randomized Parallel Decoding ☆63 Updated 2 months ago
- [ICLR 2025] Mixture Compressor for Mixture-of-Experts LLMs Gains More ☆45 Updated 3 months ago
- VidKV: Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models ☆19 Updated 2 months ago
- GIFT: Generative Interpretable Fine-Tuning ☆20 Updated 7 months ago
- Code for Draft Attention ☆32 Updated 2 weeks ago
- ✈️ Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ☆67 Updated 2 months ago
- Official implementation of "Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization" ☆77 Updated last year