sdc17 / SwiReasoning
[ICLR 2026] SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs
☆43 Updated 3 months ago
Alternatives and similar repositories for SwiReasoning
Users who are interested in SwiReasoning are comparing it to the libraries listed below.
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆30 Updated 7 months ago
- GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆88 Updated 4 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆46 Updated last year
- ☆63 Updated 6 months ago
- Official Repository of LatentSeek ☆76 Updated 8 months ago
- ☆33 Updated 2 months ago
- [AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data ☆33 Updated 10 months ago
- ☆45 Updated last month
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆70 Updated 6 months ago
- [ICLR 2026] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs ☆41 Updated 8 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 Updated 9 months ago
- [NeurIPS25 Spotlight] EMPO, A Fully Unsupervised RLVR Method ☆94 Updated 2 months ago
- ☆34 Updated 9 months ago
- ☆204 Updated last month
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆91 Updated 11 months ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆13 Updated last year
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to … ☆58 Updated last week
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?" ☆39 Updated 6 months ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆76 Updated 11 months ago
- ☆47 Updated 10 months ago
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆93 Updated last year
- Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning ☆36 Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆85 Updated last year
- Code for Heima ☆59 Updated 9 months ago
- [COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆51 Updated 10 months ago
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety ☆53 Updated 6 months ago
- ☆38 Updated last year
- Official Repository of "Learning what reinforcement learning can't" ☆79 Updated last month
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆51 Updated 6 months ago
- ☆44 Updated 7 months ago