stanford-cs336 / spring2025-lectures
☆108 Updated last week
Alternatives and similar repositories for spring2025-lectures
Users who are interested in spring2025-lectures are comparing it to the repositories listed below
- Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch ☆67 Updated last month
- An extension of the nanoGPT repository for training small MOE models. ☆142 Updated 2 months ago
- ☆274 Updated 4 months ago
- ☆85 Updated 7 months ago
- A brief and partial summary of RLHF algorithms. ☆128 Updated 2 months ago
- official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆143 Updated last week
- ☆186 Updated 3 months ago
- minimal GRPO implementation from scratch ☆90 Updated 2 months ago
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts. ☆116 Updated 8 months ago
- Tina: Tiny Reasoning Models via LoRA ☆192 Updated 3 weeks ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆294 Updated last week
- Notes on Direct Preference Optimization ☆19 Updated last year
- PyTorch building blocks for the OLMo ecosystem ☆212 Updated this week
- Advanced NLP, Spring 2025 https://cmu-l3.github.io/anlp-spring2025/ ☆51 Updated last month
- Notes and commented code for RLHF (PPO) ☆92 Updated last year
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆445 Updated last month
- SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning ☆261 Updated this week
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆230 Updated last week
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆180 Updated this week
- The HELMET Benchmark ☆143 Updated last month
- ☆177 Updated 5 months ago
- NeurIPS 2024 tutorial on LLM Inference ☆43 Updated 5 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆323 Updated 5 months ago
- ☆163 Updated 4 months ago
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ☆224 Updated this week
- Code for Paper: Learning Adaptive Parallel Reasoning with Language Models ☆77 Updated 3 weeks ago
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed". ☆169 Updated last month
- LLM-Merging: Building LLMs Efficiently through Merging ☆197 Updated 7 months ago
- Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" ☆131 Updated last week
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆99 Updated 3 weeks ago