WisdomShell / RewardAnythingLinks
RewardAnything: Generalizable Principle-Following Reward Models
☆44Updated 4 months ago
Alternatives and similar repositories for RewardAnything
Users that are interested in RewardAnything are comparing it to the libraries listed below
Sorting:
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆40Updated 7 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆140Updated 3 months ago
- ☆50Updated 2 months ago
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25]☆32Updated last month
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆68Updated 11 months ago
- ☆156Updated last week
- ☆38Updated 2 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆38Updated last year
- Large Language Models Can Self-Improve in Long-context Reasoning☆73Updated 10 months ago
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models☆91Updated last year
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs☆126Updated 5 months ago
- A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models☆26Updated 10 months ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models☆47Updated 2 weeks ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling☆51Updated 4 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆106Updated 6 months ago
- ☆50Updated 8 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆65Updated 4 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]☆89Updated 6 months ago
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect…☆114Updated 3 weeks ago
- ☆50Updated 11 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning☆69Updated 3 months ago
- A comrephensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆56Updated 4 months ago
- ☆99Updated last week
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆114Updated 5 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆62Updated 10 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"☆35Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆75Updated 4 months ago
- Offical Repository of "AtomThink: Multimodal Slow Thinking with Atomic Step Reasoning"☆57Updated 2 months ago
- Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency"☆31Updated 6 months ago
- The code and data for the paper JiuZhang3.0☆49Updated last year