sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆97 · Jun 24, 2025 · Updated 7 months ago
Alternatives and similar repositories for VeriFree
Users who are interested in VeriFree are comparing it to the libraries listed below.
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆51 · Jul 15, 2025 · Updated 6 months ago
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Nov 28, 2024 · Updated last year
- ☆51 · Oct 23, 2023 · Updated 2 years ago
- ☆75 · Jun 28, 2025 · Updated 7 months ago
- 3D Scene Flow Estimation ☆15 · Sep 24, 2025 · Updated 4 months ago
- ☆10 · Oct 20, 2023 · Updated 2 years ago
- Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using pytorch and huggingface Trainers. ☆11 · Apr 5, 2023 · Updated 2 years ago
- The official implementation of the paper "Does Federated Learning Really Need Backpropagation?" ☆23 · Feb 9, 2023 · Updated 3 years ago
- A repo for open research on building large reasoning models ☆136 · Jan 30, 2026 · Updated 2 weeks ago
- Code for the paper "Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making" ☆28 · Jul 11, 2024 · Updated last year
- ☆24 · Feb 4, 2026 · Updated last week
- An Ultra-Long Output Reinforcement Learning Approach ☆23 · Jul 31, 2025 · Updated 6 months ago
- ☆13 · Jul 25, 2023 · Updated 2 years ago
- Code for "Variational Reasoning for Language Models" ☆56 · Sep 29, 2025 · Updated 4 months ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆24 · Oct 7, 2025 · Updated 4 months ago
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024) ☆37 · Jan 23, 2024 · Updated 2 years ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 · Dec 29, 2025 · Updated last month
- Rebuttal code for SEGS-SLAM ICCV 2025 ☆18 · Jun 30, 2025 · Updated 7 months ago
- ☆13 · May 5, 2024 · Updated last year
- [ICLR 2025] Code & Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆14 · Jun 21, 2024 · Updated last year
- 🔱 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs ☆71 · Mar 21, 2025 · Updated 10 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 · Apr 14, 2025 · Updated 10 months ago
- ☆16 · Apr 30, 2024 · Updated last year
- Code and data for paper "(How) do Language Models Track State?" ☆21 · Mar 31, 2025 · Updated 10 months ago
- Code for the SofT-GRPO algorithm on the LLM soft-thinking reasoning pattern. ☆38 · Jan 2, 2026 · Updated last month
- 🤖 Complete reproduction of 'AlphaGo Moment for Model Architecture Discovery' using MLX-LM instead of GPT-4. Autonomous neural architectu… ☆27 · Jul 27, 2025 · Updated 6 months ago
- ☆14 · Oct 28, 2023 · Updated 2 years ago
- ☆27 · Jul 18, 2025 · Updated 6 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆627 · Jan 29, 2026 · Updated 2 weeks ago
- AnchorAttention: Improved attention for LLMs long-context training ☆213 · Jan 15, 2025 · Updated last year
- Trending projects & awesome papers about data-centric LLM studies. ☆40 · May 20, 2025 · Updated 8 months ago
- Implementation of the paper "Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners" ☆20 · Aug 17, 2023 · Updated 2 years ago
- The official repository of 'Unnatural Languages Are Not Bugs but Features for LLMs' ☆24 · May 20, 2025 · Updated 8 months ago
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆31 · Jun 5, 2025 · Updated 8 months ago
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast ☆118 · Mar 26, 2024 · Updated last year
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ☆218 · Nov 27, 2025 · Updated 2 months ago
- ☆19 · Feb 25, 2024 · Updated last year
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆84 · Oct 23, 2024 · Updated last year
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,205 · Aug 27, 2025 · Updated 5 months ago