xianminx / mooc-cs294-llm-agents
CS294/194-196 Large Language Model Agents
☆16Updated 3 months ago
Alternatives and similar repositories for mooc-cs294-llm-agents:
Users who are interested in mooc-cs294-llm-agents are comparing it to the repositories listed below
- Curation of resources for LLM research, screened by @tongyx361 to ensure high quality and accompanied with elaborately-written concise de…☆49Updated 8 months ago
- A Comprehensive Survey on Long Context Language Modeling☆86Updated last week
- A repository sharing literature on large language models☆80Updated 2 weeks ago
- Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing☆34Updated 2 months ago
- Pretrain, decay, and SFT a CodeLLM from scratch 🧙‍♂️☆35Updated 10 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit…☆119Updated 8 months ago
- This repo is a live list of papers on game playing and large multimodality model - "A Survey on Game Playing Agents and Large Models: Met…☆139Updated 6 months ago
- Blog posts, reading reports, and code examples on AGI/LLM-related topics.☆36Updated last month
- 🎓Automatically update agent papers daily using GitHub Actions (updates every 12 hours)☆29Updated this week
- AI Alignment: A Comprehensive Survey☆133Updated last year
- This repository contains an LLM benchmark for the social deduction game "Resistance Avalon"☆101Updated last month
- CS294/194-196 Large Language Model Agents☆9Updated last month
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.☆17Updated last month
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems☆67Updated last week
- Reproducing R1 for Code with Reliable Rewards☆132Updated 3 weeks ago
- GitHub page for "Large Language Model-Brained GUI Agents: A Survey"☆136Updated 3 weeks ago
- A Comprehensive Benchmark for Software Development.☆100Updated 9 months ago
- Course notes for Cyber Security (THUCST 2023 Spring)☆26Updated last year
- What is learned in tiktoken?☆67Updated 10 months ago
- Notes and commented code for RLHF (PPO)☆77Updated last year
- Official implementation of the paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038)☆177Updated 2 weeks ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning☆158Updated last week
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.☆212Updated this week
- connecting humans and agents☆79Updated 3 months ago
- Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis☆116Updated 3 weeks ago