TianshuoY / HKU-DASC7606-A1
☆25Updated last year
Alternatives and similar repositories for HKU-DASC7606-A1
Users that are interested in HKU-DASC7606-A1 are comparing it to the libraries listed below
- ☆15Updated last year
 - ☆13Updated 11 months ago
 - ☆1,083Updated last month
 - ☆21Updated 3 months ago
 - ICLR 2025 Agent-Related Papers☆73Updated 11 months ago
 - Large Language Model based Multi-Agents: A Survey of Progress and Challenges☆1,117Updated last year
 - Survey on LLM Agents (Published on CoLing 2025)☆409Updated last month
 - Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning☆848Updated 3 months ago
 - The official code of ARPO & AEPO☆748Updated last week
 - Latest Advances on System-2 Reasoning☆1,260Updated 4 months ago
 - A reproduction of open-r1 that runs GRPO training on 0.5B, 1.5B, 3B, and 7B Qwen models and observes some interesting phenomena.☆47Updated 6 months ago
 - A repo that lists papers related to LLM-based agents☆2,083Updated 3 months ago
 - A Survey of Reinforcement Learning for Large Reasoning Models☆1,951Updated this week
 - This is the repository for the Tool Learning survey.☆451Updated 2 months ago
 - Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhi…☆636Updated last month
 - ☆414Updated 3 weeks ago
 - [ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"☆437Updated 3 weeks ago
 - MAD: The first work to explore Multi-Agent Debate with Large Language Models :D☆453Updated 9 months ago
 - ☆459Updated 3 months ago
 - verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in…☆1,117Updated 2 weeks ago
 - [NeurIPS 2025] TTRL: Test-Time Reinforcement Learning☆878Updated last month
 - ☆589Updated 2 weeks ago
 - Daily updated LLM papers. Subscriptions welcome 👏 If you like it, please give it a star 🌟☆1,191Updated last year
 - Latest Advances on Long Chain-of-Thought Reasoning☆537Updated 3 months ago
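 - Welcome to LLM-Dojo, an open-source place for learning about large models that uses concise, readable code to build a model training framework (supporting mainstream models such as Qwen, Llama, and GLM), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩🎓👨🎓☆888Updated this week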
 - ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning☆1,230Updated 5 months ago
 - A collection of recent reproduction papers and projects on DeepSeek-R1☆32Updated 8 months ago
 - MAT: Multi-modal Agent Tuning 🔥 ICLR 2025 (Spotlight)