tinnerhrhe/ROVER

An official implementation of "Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards"

☆36

Alternatives and similar repositories for ROVER

Users who are interested in ROVER are comparing it to the repositories listed below.

William030422 / Video-Sycophancy
Implementation for paper Flattery in Motion: Benchmarking and Analyzing Sycophancy in Video-LLMs, which is accepted by ACL 2026 (main con…
☆16 · Oct 10, 2025 · Updated 9 months ago
hanningzhang / ER-PRM
☆20 · Dec 14, 2024 · Updated last year
Kwai-Klear / RLEP
RL with Experience Replay
☆58 · Jul 27, 2025 · Updated last year
idanshen / Value-Augmented-Sampling
☆20 · May 16, 2024 · Updated 2 years ago
StarDewXXX / UltraHorizon
Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios
☆27 · Sep 30, 2025 · Updated 9 months ago
dbsxodud-11 / PAG
Official Code for Learning to Sample Effective and Diverse Prompts for Text-to-Image Generation (CVPR 2025)
☆15 · Apr 2, 2025 · Updated last year
GAIR-NLP / lm-open-science-evaluation
Reproducible and flexible LLM evaluations for scientific reasoning.
☆29 · Jul 23, 2025 · Updated last year
tml-epfl / long-is-more-for-alignment
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024]
☆21 · May 2, 2024 · Updated 2 years ago
sail-sg / feedback-conditional-policy
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆65 · Jan 5, 2026 · Updated 6 months ago
CharlieMat / GFN4Rec
Source code for paper "Generative Flow Network for Listwise Recommendation"
☆18 · Nov 8, 2024 · Updated last year
zaydzuhri / token-order-prediction
Landing repository for the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"
☆48 · May 13, 2026 · Updated 2 months ago
bbartoldson / TBA
Official implementation of TBA for async LLM post-training.
☆32 · Nov 5, 2025 · Updated 8 months ago
complex-reasoning / RPG
[ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)
☆76 · Jun 29, 2026 · Updated last month
CharlieMat / KRLBenchmark
Kuaishou Online RL Benchmark
☆19 · Oct 21, 2023 · Updated 2 years ago
ZHUWEI-hub / GUARD
[ACL 2026] Dissecting Failure Dynamics in Large Language Model Reasoning
☆18 · Apr 17, 2026 · Updated 3 months ago
Hesse73 / RLVR-Directions
Source Code for our ICLR'26 paper
☆17 · Feb 22, 2026 · Updated 5 months ago
glorgao / SelectiveDPO
Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples
☆47 · Jul 16, 2025 · Updated last year
stepfun-ai / PaCoRe
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
☆338 · Feb 5, 2026 · Updated 5 months ago
microsoft / EpiCoder
Implementation for "EpiCoder: Encompassing Diversity and Complexity in Code Generation" (ICML 2025)
☆27 · May 16, 2025 · Updated last year
vincentamato / mlx-esm-2
An MLX implementation of Meta AI's ESM-2 protein language model
☆16 · Aug 16, 2025 · Updated 11 months ago
pipixiaqishi1 / SAM-E
☆53 · Oct 9, 2024 · Updated last year
shangshang-wang / Resa
Resa: Transparent Reasoning Models via SAEs
☆50 · Sep 23, 2025 · Updated 10 months ago
bcml-labs / rosa-plus
ROSA+: RWKV's ROSA implementation with fallback statistical predictor
☆36 · Oct 13, 2025 · Updated 9 months ago
nblt / RWP
☆11 · Dec 8, 2022 · Updated 3 years ago
wutaiqiang / awesome-GNN2MLP-distillation
Learning MLPs to replace GNN
☆10 · Jun 3, 2023 · Updated 3 years ago
leroy9472 / InMind
☆15 · Nov 18, 2025 · Updated 8 months ago
zhangsichengsjtu / AFPQ
AFPQ code implementation
☆23 · Nov 6, 2023 · Updated 2 years ago
Sundiasy / TopoDIM
[ACL26 Findings] TopoDIM: One-shot Topology Generation of Diverse Interaction Modes for Multi-Agent Systems
☆19 · Jan 19, 2026 · Updated 6 months ago
stepfun-ai / StepDeepResearch
Step-DeepResearch
☆570 · Mar 24, 2026 · Updated 4 months ago
bitvis2021 / HiTailor
The implementation of the complex table visualizations
☆17 · Jan 2, 2024 · Updated 2 years ago
cassidylaidlaw / effective-horizon
Code and data for the paper "Bridging RL Theory and Practice with the Effective Horizon"
☆50 · Jun 26, 2024 · Updated 2 years ago
sail-sg / variational-reasoning
Code for "Variational Reasoning for Language Models"
☆60 · Sep 29, 2025 · Updated 10 months ago
sdan / nanoEBM
minimal Energy-based transformer
☆44 · Dec 11, 2025 · Updated 7 months ago
mukhal / ThinkPRM
[TMLR] Process Reward Models That Think
☆90 · Nov 29, 2025 · Updated 8 months ago
deeplearning-wisc / args
☆47 · Feb 8, 2024 · Updated 2 years ago
zzhang393 / DataMosaic-1.0
DataMosaic: Explainable and Verifiable Document-Based Data Analytics
☆20 · Jun 30, 2025 · Updated last year
Singularity0104 / equilibrium-planner
[ICML 2025] Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling
☆13 · May 5, 2025 · Updated last year
ling-pan / GAFN
☆25 · Mar 26, 2024 · Updated 2 years ago
allenai / AskOlmo
☆15 · Nov 19, 2025 · Updated 8 months ago