Reinforcing General Reasoning without Verifiers
☆96 · Jun 24, 2025 · Updated 8 months ago
Alternatives and similar repositories for VeriFree
Users who are interested in VeriFree are comparing it to the repositories listed below.
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆52 · Jul 15, 2025 · Updated 7 months ago
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Nov 28, 2024 · Updated last year
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024) ☆65 · Jan 11, 2025 · Updated last year
- ☆52 · Oct 23, 2023 · Updated 2 years ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆183 · Jul 23, 2025 · Updated 7 months ago
- ☆46 · Jun 24, 2025 · Updated 8 months ago
- ☆74 · Jun 28, 2025 · Updated 8 months ago
- 3D Scene Flow Estimation ☆14 · Sep 24, 2025 · Updated 5 months ago
- Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using pytorch and huggingface Trainers. ☆11 · Apr 5, 2023 · Updated 2 years ago
- ☆10 · Oct 20, 2023 · Updated 2 years ago
- The official implementation of the paper "Does Federated Learning Really Need Backpropagation?" ☆23 · Feb 9, 2023 · Updated 3 years ago
- A repo for open research on building large reasoning models ☆140 · Updated this week
- Code for the paper "Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making" ☆28 · Jul 11, 2024 · Updated last year
- An Ultra-Long Output Reinforcement Learning Approach ☆23 · Jul 31, 2025 · Updated 7 months ago
- [ECCV 2024] Official Implementation of "Appearance-Based Refinement for Object-Centric Motion Segmentation" Junyu Xie, Weidi Xie, Andrew … ☆13 · Oct 23, 2024 · Updated last year
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆32 · Jan 23, 2025 · Updated last year
- Code for "Variational Reasoning for Language Models" ☆56 · Sep 29, 2025 · Updated 5 months ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆25 · Oct 7, 2025 · Updated 4 months ago
- ☆12 · May 5, 2024 · Updated last year
- Rebuttal code for SEGS-SLAM ICCV 2025 ☆17 · Jun 30, 2025 · Updated 8 months ago
- 🔱 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs ☆71 · Mar 21, 2025 · Updated 11 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 · Apr 14, 2025 · Updated 10 months ago
- 🤖 Complete reproduction of 'AlphaGo Moment for Model Architecture Discovery' using MLX-LM instead of GPT-4. Autonomous neural architectu… ☆27 · Jul 27, 2025 · Updated 7 months ago
- ☆14 · Oct 28, 2023 · Updated 2 years ago
- ☆19 · May 20, 2025 · Updated 9 months ago
- ☆27 · Jul 18, 2025 · Updated 7 months ago
- ☆17 · Mar 23, 2025 · Updated 11 months ago
- ☆16 · Apr 30, 2024 · Updated last year
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆633 · Jan 29, 2026 · Updated last month
- ☆14 · May 31, 2022 · Updated 3 years ago
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆31 · Jun 5, 2025 · Updated 9 months ago
- ☆33 · Feb 4, 2026 · Updated last month
- Implementation of the paper "Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners" ☆20 · Aug 17, 2023 · Updated 2 years ago
- Collaborative Dynamic 3D Scene Graphs for Open-Vocabulary Urban Scene Understanding ☆31 · Dec 23, 2025 · Updated 2 months ago
- The official repository of 'Unnatural Language Are Not Bugs but Features for LLMs' ☆24 · May 20, 2025 · Updated 9 months ago
- [NeurIPS 2025] The implementation of paper "On Reasoning Strength Planning in Large Reasoning Models" ☆30 · Jul 6, 2025 · Updated 8 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆28 · Mar 1, 2025 · Updated last year
- Code for the paper "Efficient Dataset Distillation using Random Feature Approximation" ☆37 · Feb 24, 2023 · Updated 3 years ago
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast ☆118 · Mar 26, 2024 · Updated last year