wantbook-book / SeRL
SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data
☆14 Updated last month
Alternatives and similar repositories for SeRL
Users who are interested in SeRL are comparing it to the libraries listed below.
- [NeurIPS 2024] Official implementation for paper "Can Graph Learning Improve Planning in LLM-based Agents?" ☆142 Updated 5 months ago
- Code repo for "LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners" ☆48 Updated 4 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆130 Updated 8 months ago
- ☆188 Updated 2 months ago
- A research repo for experiments about Reinforcement Finetuning ☆52 Updated 6 months ago
- A trend starts from "Chain of Thought Prompting Elicits Reasoning in Large Language Models". ☆41 Updated 2 years ago
- [ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correla… ☆45 Updated 5 months ago
- VeriGUI: Verifiable Long-Chain GUI Dataset ☆81 Updated last week
- ☆87 Updated 6 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆56 Updated 10 months ago
- A comprehensive collection of process reward models. ☆111 Updated 2 weeks ago
- RFTT: Reasoning with Reinforced Functional Token Tuning ☆29 Updated 4 months ago
- Repo of "Large Language Model-based Human-Agent Collaboration for Complex Task Solving" (EMNLP 2024 Findings)