NuoJohnChen / JudgeLRM
☆27 Updated last month
Alternatives and similar repositories for JudgeLRM
Users who are interested in JudgeLRM are comparing it to the repositories listed below
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆36 Updated last week
- A Sober Look at Language Model Reasoning ☆52 Updated this week
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆43 Updated 3 weeks ago
- ☆105 Updated 2 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆69 Updated 3 months ago
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ☆51 Updated 10 months ago
- ☆17 Updated 5 months ago
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models ☆44 Updated last month
- ☆22 Updated 10 months ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆73 Updated 7 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆70 Updated 2 months ago
- Code for Heima ☆43 Updated last month
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆59 Updated last month
- Codebase for Instruction Following without Instruction Tuning ☆34 Updated 8 months ago
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆64 Updated 3 months ago
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆51 Updated last week
- ☆89 Updated last week
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆36 Updated 3 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆46 Updated last week
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆84 Updated 7 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆36 Updated last week
- [ACL 2025] Knowledge Unlearning for Large Language Models ☆32 Updated 3 weeks ago
- Official repository for Decentralized Arena via Collective LLM Intelligence ☆13 Updated 2 weeks ago
- ☆45 Updated 3 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆22 Updated 3 months ago
- ☆40 Updated 3 weeks ago
- ☆107 Updated last week
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆88 Updated 2 weeks ago
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient Training R1-like Reasoning Models ☆49 Updated this week
- Official code for Guiding Language Model Math Reasoning with Planning Tokens ☆11 Updated last year