Reproducing R1 for Code with Reliable Rewards
☆12Apr 9, 2025Updated 10 months ago
Alternatives and similar repositories for code-r1
Users who are interested in code-r1 are comparing it to the libraries listed below.
- Your efficient and accurate answer verification system for RL training.☆41Jun 23, 2025Updated 8 months ago
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to …☆61Jan 28, 2026Updated last month
- Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts☆25Feb 23, 2024Updated 2 years ago
- Async pipelined version of Verl☆124Apr 8, 2025Updated 10 months ago
- [AAAI 2025] Automatically Generating Numerous Context-Driven SFT Data for LLMs across Diverse Granularity☆26Mar 17, 2025Updated 11 months ago
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"☆41Jun 24, 2025Updated 8 months ago
- ☆28Nov 10, 2025Updated 3 months ago
- [ICML 2025] Official resources of "KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search".☆34Dec 6, 2025Updated 2 months ago
- ☆30Dec 27, 2024Updated last year
- JudgeLRM: Large Reasoning Models as a Judge☆41Jan 29, 2026Updated 3 weeks ago
- ☆36Jul 7, 2025Updated 7 months ago
- ✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework☆311Sep 6, 2025Updated 5 months ago
- g2-MLP: State-of-the-Art Model for Node Classification on Graphs (PPI Dataset)☆10Nov 12, 2022Updated 3 years ago
- Collection of papers for scalable automated alignment.☆93Oct 22, 2024Updated last year
- [ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment"☆10May 5, 2024Updated last year
- ☆42Sep 19, 2024Updated last year
- Implementation for EACL 2024 paper "Corpus-Steered Query Expansion with Large Language Models"☆12Mar 19, 2024Updated last year
- Language Models for Code Completion: a Practical Evaluation☆13Jan 19, 2024Updated 2 years ago
- ☆14Dec 1, 2025Updated 2 months ago
- Tutorial for Scala on Spark only☆12May 6, 2018Updated 7 years ago
- Dataset from Tip of the Tongue Known-Item Retrieval (2021) paper.☆12Nov 4, 2021Updated 4 years ago
- ☆10Oct 6, 2021Updated 4 years ago
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning☆57Oct 10, 2025Updated 4 months ago
- ACL Rolling Review website☆11Feb 9, 2026Updated 2 weeks ago
- Paper list for vision-language tracking☆15Nov 10, 2025Updated 3 months ago
- MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research☆22Sep 23, 2025Updated 5 months ago
- ☆12Mar 24, 2024Updated last year
- ☆14Mar 5, 2024Updated last year
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"☆11Oct 11, 2024Updated last year
- Official Repo of "CIBench: Evaluation of LLMs as Code Interpreter"☆14Jul 19, 2024Updated last year
- data prep utilities for LLMs, using LLMs☆16Nov 7, 2023Updated 2 years ago
- Applications of graph neural networks in recommender systems☆13Aug 26, 2021Updated 4 years ago
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows☆19Nov 4, 2025Updated 3 months ago
- ☆11Nov 24, 2021Updated 4 years ago
- A flask-style `route` decorator for django views☆12Dec 21, 2023Updated 2 years ago
- A recommendation algorithm based on attributes and graph neural networks (undergraduate thesis project)☆14Mar 20, 2021Updated 4 years ago
- ☆15Nov 23, 2023Updated 2 years ago
- ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation☆25Aug 24, 2025Updated 6 months ago
- Source code and data for ADEPT: A DEbiasing PrompT Framework (AAAI-23).☆15Dec 13, 2024Updated last year