xianminx / mooc-cs294-llm-agents
CS294/194-196 Large Language Model Agents
☆35Updated 11 months ago
Alternatives and similar repositories for mooc-cs294-llm-agents
Users who are interested in mooc-cs294-llm-agents are comparing it to the repositories listed below.
- Notes and commented code for RLHF (PPO)☆118Updated last year
- ☆399Updated 11 months ago
- [EMNLP 2025 Demo] TinyScientist: A Lightweight Framework for Building Research Agents☆119Updated 3 weeks ago
- [ICML 2025] ResearchTown: Simulator of Human Research Community☆183Updated this week
- ☆99Updated last year
- This is a survey of research on AI scientists, AI researchers, AI engineers, and a series of AI-driven research studies☆161Updated last month
- ☆78Updated 4 months ago
- Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/☆68Updated 8 months ago
- A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning mate…☆306Updated 9 months ago
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"☆161Updated last month
- An RL framework for multi-LLM agent systems☆69Updated last week
- Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038)☆188Updated 3 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆135Updated last year
- ☆68Updated last year
- This is the reading list for the survey "A Survey on the Optimization of LLM-based Agents". We will keep adding papers and improving the…☆172Updated 4 months ago
- ☆459Updated 3 months ago
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)☆125Updated 6 months ago
- Repository for Zochi's Research☆289Updated last week
- ☆427Updated 4 months ago
- ☆75Updated 6 months ago
- minimal GRPO implementation from scratch☆99Updated 8 months ago
- ☆197Updated 4 months ago
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts.☆139Updated last year
- Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".☆275Updated 9 months ago
- A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)☆192Updated 3 months ago
- A brief and partial summary of RLHF algorithms.☆139Updated 8 months ago
- ☆193Updated 3 months ago
- Tina: Tiny Reasoning Models via LoRA☆308Updated 2 months ago
- ☆698Updated last month
- A Telegram bot to recommend arXiv papers☆289Updated 2 weeks ago