xianminx / mooc-cs294-llm-agents
CS294/194-196 Large Language Model Agents
☆21Updated 5 months ago
Alternatives and similar repositories for mooc-cs294-llm-agents
Users who are interested in mooc-cs294-llm-agents are comparing it to the repositories listed below
- ☆102Updated last month
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems☆82Updated 2 months ago
- Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing☆46Updated 4 months ago
- A brief and partial summary of RLHF algorithms.☆129Updated 3 months ago
- Curation of resources for LLM research, screened by @tongyx361 to ensure high quality and accompanied with elaborately-written concise de…☆54Updated 10 months ago
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)☆105Updated 3 weeks ago
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks☆74Updated last month
- Run TRex with PPO☆38Updated 3 weeks ago
- A collection of tricks and tools to speed up transformer models☆167Updated this week
- A Comprehensive Survey on Long Context Language Modeling☆147Updated this week
- A Telegram bot to recommend arXiv papers☆272Updated last month
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆73Updated this week
- ☆77Updated 2 months ago
- ☆231Updated last week
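- An open-source mathematical large-model project that aims to explore whether large models possess mathematical creativity, as well as their potential in frontier mathematical research.☆14Updated 3 weeks ago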
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]☆86Updated last month
- Efficient Agent Training for Computer Use☆94Updated last week
- Notes and commented code for RLHF (PPO)☆96Updated last year
- This is a survey of research on AI scientists, AI researchers, AI engineers, and a series of AI-driven research studies☆64Updated 3 weeks ago
- Official Implementation of "Reasoning Language Models: A Blueprint"☆62Updated 3 months ago
- Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/☆53Updated 2 months ago
- ☆59Updated 10 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆129Updated last month
- ☆210Updated 2 weeks ago
- ☆151Updated this week
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".☆95Updated 2 months ago
- official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example”☆257Updated this week
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"☆11Updated 7 months ago
- Code for the paper: "Learning to Reason without External Rewards"☆237Updated this week
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning☆180Updated 2 months ago