SSRL: Self-Search Reinforcement Learning
☆207 · Aug 20, 2025 · Updated 6 months ago
Alternatives and similar repositories for SSRL
Users who are interested in SSRL are comparing it to the libraries listed below.
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆31 · Updated this week
- ☆54 · Jan 15, 2026 · Updated last month
- ☆20 · Jul 23, 2025 · Updated 7 months ago
- ☆31 · Sep 19, 2025 · Updated 5 months ago
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning ☆62 · Oct 24, 2025 · Updated 4 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆115 · Apr 9, 2025 · Updated 10 months ago
- Code for KaLM-Embedding models ☆114 · Jun 30, 2025 · Updated 8 months ago
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆20 · Apr 9, 2025 · Updated 10 months ago
- ☆69 · Jan 18, 2026 · Updated last month
- ☆28 · Feb 7, 2025 · Updated last year
- ☆33 · Jul 15, 2025 · Updated 7 months ago
- Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging dif… ☆28 · Jan 21, 2025 · Updated last year
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆256 · Aug 12, 2025 · Updated 6 months ago
- A Structured Span Selector (NAACL 2022). A structured span selector with a WCFG for span selection tasks (coreference resolution, semanti… ☆21 · Jul 11, 2022 · Updated 3 years ago
- Test-time Scaling for VAR models ☆31 · Sep 19, 2025 · Updated 5 months ago
- The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents" ☆107 · Sep 29, 2025 · Updated 5 months ago
- ☆46 · Mar 4, 2025 · Updated last year
- ☆136 · Jan 26, 2026 · Updated last month
- Long Context Extension and Generalization in LLMs ☆63 · Sep 21, 2024 · Updated last year
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆71 · Jun 1, 2025 · Updated 9 months ago
- Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling' ☆27 · May 16, 2025 · Updated 9 months ago
- Automatic Thief Detection via CCTV with Alarm System and Perpetrator Image Capture using YOLOv5 + ROI. This project utilizes computer vis… ☆14 · Oct 21, 2024 · Updated last year
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆81 · Dec 25, 2025 · Updated 2 months ago
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL. ☆544 · Sep 8, 2025 · Updated 5 months ago
- The open-source code of MetaStone-S1. ☆106 · Aug 1, 2025 · Updated 7 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆183 · Jul 23, 2025 · Updated 7 months ago
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution ☆69 · Dec 8, 2025 · Updated 2 months ago
- ☆46 · Jun 24, 2025 · Updated 8 months ago
- ☆56 · Feb 6, 2026 · Updated 3 weeks ago
- A comprehensive benchmark for evaluating deep research agents on academic survey tasks ☆50 · Sep 4, 2025 · Updated 6 months ago
- Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076. ☆25 · Jan 23, 2024 · Updated 2 years ago
- ☆78 · Jan 22, 2026 · Updated last month
- Code for paper: Optimizing Length Compression in Large Reasoning Models ☆27 · Oct 20, 2025 · Updated 4 months ago
- ACE (Adaptive Code Evolution) is an AI-powered system for code analysis and optimization. ☆12 · Nov 4, 2025 · Updated 4 months ago
- ☆15 · Jan 12, 2026 · Updated last month
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 · Jun 22, 2025 · Updated 8 months ago
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows ☆19 · Nov 4, 2025 · Updated 4 months ago
- ☆22 · Nov 18, 2025 · Updated 3 months ago
- Code for the paper Normalizing Flows are Capable Models for RL ☆18 · Jun 3, 2025 · Updated 9 months ago