stanford-cs336 / spring2025-lectures
☆477 Updated 2 weeks ago
Alternatives and similar repositories for spring2025-lectures
Users who are interested in spring2025-lectures are comparing it to the repositories listed below
- Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch ☆174 Updated 2 months ago
- ☆298 Updated 6 months ago
- Notes and commented code for RLHF (PPO) ☆96 Updated last year
- ☆174 Updated 5 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆476 Updated last month
- ☆193 Updated 4 months ago
- TTRL: Test-Time Reinforcement Learning ☆650 Updated 2 weeks ago
- Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models ☆464 Updated last week
- A bibliography and survey of the papers surrounding o1 ☆1,199 Updated 7 months ago
- Official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆290 Updated last week
- ☆89 Updated 9 months ago
- Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/ ☆55 Updated 2 months ago
- ☆782 Updated last month
- ☆570 Updated 2 months ago
- An extension of the nanoGPT repository for training small MOE models. ☆152 Updated 3 months ago
- ☆343 Updated 2 months ago
- This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov ☆1,796 Updated last month
- PyTorch building blocks for the OLMo ecosystem ☆238 Updated this week
- Understanding R1-Zero-Like Training: A Critical Perspective ☆991 Updated last month
- A collection of 150+ surveys on LLMs ☆301 Updated 4 months ago
- A project to improve skills of large language models ☆429 Updated this week
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling ☆395 Updated last month
- SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning ☆422 Updated this week
- ☆300 Updated 3 weeks ago
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆1,548 Updated 3 weeks ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆339 Updated 6 months ago
- Textbook on reinforcement learning from human feedback ☆1,052 Updated this week
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents ☆519 Updated this week
- Awesome Reasoning LLM Tutorial/Survey/Guide ☆1,781 Updated last week
- An ML Systems Onboarding list ☆816 Updated 5 months ago