TianshuoY / HKU-DASC7606-A1
☆25Updated 10 months ago
Alternatives and similar repositories for HKU-DASC7606-A1
Users who are interested in HKU-DASC7606-A1 are comparing it to the repositories listed below
- ☆15Updated 10 months ago
- Latest Advances on System-2 Reasoning☆1,214Updated 2 months ago
- Awesome RL Reasoning Recipes ("Triple R")☆768Updated last month
- ☆20Updated last week
- Awesome RL-based LLM Reasoning☆592Updated 3 weeks ago
- ICLR 2025 Agent-Related Papers☆71Updated 8 months ago
- Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning☆721Updated 2 weeks ago
- ☆13Updated 8 months ago
- Daily updated LLM papers. Subscriptions are welcome 👏 If you like it, please give it a 🌟☆1,158Updated last year
- [ICML'24 Spotlight] "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"☆394Updated last month
- Generative AI Act II: Test Time Scaling Drives Cognition Engineering☆202Updated 3 months ago
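- EasyOffer (a collection of LLM interview experiences) is a summer-internship offer guide tailored for LLM newcomers. It mainly records common live-coding problems, interview experiences, and common open-ended questions from major tech companies, gathered while preparing for LLM summer internships and autumn recruitment. I am a beginner and still learning, so corrections are welcome at any time; I hope everyone gets the Of…☆294Updated 4 months ago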
- Survey on LLM Agents (Published at COLING 2025)☆358Updated 3 months ago
- A very simple GRPO implementation for reproducing R1-like LLM thinking.☆1,255Updated this week
- Latest Advances on Long Chain-of-Thought Reasoning☆470Updated 3 weeks ago
- TTRL: Test-Time Reinforcement Learning☆748Updated 3 weeks ago
- An example reproduction checklist for AAAI-26 submissions.☆106Updated last week
- Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey☆748Updated 3 weeks ago
- Large Language Model based Multi-Agents: A Survey of Progress and Challenges☆1,068Updated last year
- verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in…☆702Updated this week
- Building a comprehensive and handy list of papers for GUI agents☆456Updated last month
- ☆17Updated this week
- ☆78Updated 11 months ago
- ☆425Updated 2 weeks ago
- ☆310Updated 2 months ago
- O1 Replication Journey☆1,999Updated 6 months ago
- ☆545Updated 7 months ago
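- Welcome to LLM-Dojo, an open-source place for learning about large language models. It uses concise, easy-to-read code to build a model training framework (supporting mainstream models such as Qwen, Llama, GLM, and others), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩🎓👨🎓☆827Updated 3 weeks ago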
- A collection of recent reproduction papers and projects on DeepSeek-R1☆32Updated 5 months ago