xyliu-cs / RISE
[NeurIPS'25] Official Implementation of RISE (Reinforcing Reasoning with Self-Verification)
☆30 Updated 2 months ago
Alternatives and similar repositories for RISE
Users who are interested in RISE are comparing it to the repositories listed below.
- ☆33 Updated last month
- A novel approach to improve the safety of large language models, enabling them to transition effectively from an unsafe to a safe state. ☆69 Updated 5 months ago
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 SRW ☆63 Updated last year
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆114 Updated 5 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆90 Updated last month
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models". ☆83 Updated last year
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025) ☆19 Updated 2 weeks ago
- Multilingual safety benchmark for Large Language Models ☆53 Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆62 Updated last year
- [NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning ☆130 Updated last month
- ☆28 Updated last week
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆71 Updated last year
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA… ☆29 Updated 11 months ago
- Reproducing R1 for Code with Reliable Rewards ☆262 Updated 5 months ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆154 Updated last year
- ☆17 Updated last year
- [LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization ☆38 Updated 7 months ago
- Training and Benchmarking LLMs for Code Preference. ☆36 Updated 11 months ago
- [NeurIPS 2024] The official implementation of the paper "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs". ☆131 Updated 7 months ago
- The Paper List on Data Contamination for Large Language Models Evaluation. ☆102 Updated 2 months ago
- [ICSE'25] Aligning the Objective of LLM-based Program Repair ☆20 Updated 7 months ago
- [NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" ☆115 Updated 2 months ago
- ☆12 Updated 8 months ago
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆62 Updated 10 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆83 Updated 7 months ago
- [NeurIPS 2025 D&B] 🚀 SWE-bench Goes Live! ☆129 Updated this week
- e ☆41 Updated 6 months ago
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning ☆46 Updated 4 months ago
- ☆54 Updated last year
- A Comprehensive Benchmark for Software Development. ☆116 Updated last year