xianminx / mooc-cs294-llm-agents
CS294/194-196 Large Language Model Agents
☆43 Updated last year
Alternatives and similar repositories for mooc-cs294-llm-agents
Users who are interested in mooc-cs294-llm-agents are comparing it to the repositories listed below
- Notes and commented code for RLHF (PPO) ☆124 Updated last year
- ☆100 Updated 6 months ago
- ☆409 Updated last year
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO) ☆141 Updated 8 months ago
- Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/ ☆71 Updated 10 months ago
- ☆100 Updated last year
- This is a survey of research on AI scientists, AI researchers, AI engineers, and a series of AI-driven research studies ☆174 Updated 2 months ago
- ☆70 Updated last year
- A brief and partial summary of RLHF algorithms. ☆143 Updated 10 months ago
- ☆78 Updated 8 months ago
- [ICML 2025] ResearchTown: Simulator of Human Research Community ☆192 Updated this week
- ☆466 Updated 5 months ago
- Repository for Zochi's Research ☆299 Updated 2 months ago
- Solutions for CS224n (2022) ☆72 Updated last year
- ☆82 Updated last year
- A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning mate… ☆307 Updated 11 months ago
- A RL Framework for multi LLM agent system ☆91 Updated last week
- Minimal hackable GRPO implementation ☆319 Updated 11 months ago
- NeurIPS 2024 tutorial on LLM Inference ☆47 Updated last year
- minimal GRPO implementation from scratch ☆102 Updated 10 months ago
- ☆210 Updated 5 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆150 Updated last year
- [EMNLP 2025 Demo] TinyScientist: A Lightweight Framework for Building Research Agents ☆126 Updated 2 months ago
- ☆213 Updated 6 months ago
- Based on Nano-vLLM, a simple replication of vLLM with self-contained paged attention and flash attention implementation ☆243 Updated last week
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training" ☆166 Updated 3 months ago
- ☆830 Updated 3 months ago
- A Telegram bot to recommend arXiv papers ☆302 Updated 2 months ago
- Student version of Assignment 2 for Stanford CS336 - Language Modeling From Scratch ☆162 Updated 6 months ago
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts. ☆140 Updated last year