openai / safety-rbr-code-and-data
Code and example data for the paper: Rule Based Rewards for Language Model Safety
☆187Updated 10 months ago
Alternatives and similar repositories for safety-rbr-code-and-data
Users who are interested in safety-rbr-code-and-data are comparing it to the libraries listed below
- Self-Alignment with Principle-Following Reward Models☆161Updated 3 weeks ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"☆155Updated 2 weeks ago
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate"☆151Updated last month
- General Reasoner: Advancing LLM Reasoning Across All Domains☆117Updated this week
- ☆173Updated 2 months ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆207Updated 3 weeks ago
- ☆97Updated 11 months ago
- Repo of paper "Free Process Rewards without Process Labels"☆149Updated 2 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆120Updated 8 months ago
- ☆174Updated last month
- ☆201Updated 3 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling☆102Updated 4 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users☆223Updated 7 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆184Updated 2 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆136Updated 6 months ago
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]☆136Updated 8 months ago
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆73Updated 2 weeks ago
- ☆113Updated 4 months ago
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following☆127Updated 10 months ago
- ☆293Updated this week
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI☆107Updated 2 weeks ago
- Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024☆128Updated 3 months ago
- ☆81Updated 6 months ago
- A Comprehensive Survey on Long Context Language Modeling☆147Updated 2 weeks ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨☆220Updated last year
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆121Updated 2 months ago
- ☆127Updated 3 weeks ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]☆106Updated 3 months ago
- Critique-out-Loud Reward Models☆66Updated 7 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"☆106Updated last year