Raj-08 / Q-Flow
Complete Reinforcement Learning Toolkit for Large Language Models!
☆20 Updated last month
Alternatives and similar repositories for Q-Flow
Users that are interested in Q-Flow are comparing it to the libraries listed below
- Natural Language Reinforcement Learning ☆96 Updated last month
- ☆32 Updated 10 months ago
- Official implementation of the paper "Process Reward Model with Q-value Rankings" ☆61 Updated 7 months ago
- ☆20 Updated 10 months ago
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆31 Updated last year
- Dataset Reset Policy Optimization ☆30 Updated last year
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning ☆49 Updated last year
- ☆53 Updated 7 months ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World". ☆29 Updated 3 weeks ago
- ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning