ritzz-ai / PACS
☆30 Updated 2 months ago
Alternatives and similar repositories for PACS
Users who are interested in PACS are comparing it to the libraries listed below.
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression ☆123 Updated 7 months ago
- Instruction-following benchmark for large reasoning models ☆45 Updated 3 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆25 Updated 3 months ago
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution" ☆54 Updated 2 months ago
- ☆50 Updated last month
- ☆38 Updated 3 months ago
- [AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆90 Updated 3 weeks ago
- The official repository of the Omni-MATH benchmark. ☆88 Updated 11 months ago
- [ACL 2025] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ☆84 Updated 9 months ago
- ☆29 Updated 5 months ago
- [arXiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents ☆46 Updated 4 months ago
- ☆69 Updated 5 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆180 Updated 4 months ago
- ☆63 Updated last month
- ☆36 Updated last month
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training ☆43 Updated 3 months ago
- ☆64 Updated 5 months ago
- Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping ☆60 Updated 6 months ago
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL). ☆41 Updated last month
- SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ☆107 Updated last week
- The rule-based evaluation subset and code implementation of Omni-MATH ☆25 Updated 11 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆79 Updated 2 months ago
- The demo, code and data of FollowRAG ☆75 Updated 5 months ago