stanford-cs336 / spring2025-lectures
☆47 Updated last week
Alternatives and similar repositories for spring2025-lectures:
Users who are interested in spring2025-lectures are comparing it to the libraries listed below
- Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch ☆51 Updated last week
- ☆169 Updated 2 months ago
- minimal GRPO implementation from scratch ☆85 Updated last month
- An extension of the nanoGPT repository for training small MOE models. ☆131 Updated last month
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆178 Updated this week
- ☆28 Updated 5 months ago
- Notes on Direct Preference Optimization ☆19 Updated last year
- Fine-tune an LLM to perform batch inference and online serving. ☆109 Updated last week
- ☆85 Updated 7 months ago
- ☆254 Updated 4 months ago
- Best practices & guides on how to write distributed pytorch training code ☆401 Updated 2 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆135 Updated last month
- NanoGPT-speedrunning for the poor T4 enjoyers ☆62 Updated this week
- ☆45 Updated 3 weeks ago
- RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct ☆28 Updated 2 months ago
- ☆128 Updated 3 weeks ago
- ☆47 Updated 7 months ago
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts. ☆111 Updated 8 months ago
- ☆45 Updated last month
- ⏰ AI conference deadline countdowns ☆254 Updated last month
- PyTorch building blocks for the OLMo ecosystem ☆197 Updated this week
- ☆19 Updated last week
- ☆40 Updated 11 months ago
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed". ☆167 Updated 3 weeks ago
- ☆37 Updated last year
- code for training & evaluating Contextual Document Embedding models ☆181 Updated last week
- making the official triton tutorials actually comprehensible ☆26 Updated last month
- Code for studying the super weight in LLM ☆98 Updated 4 months ago
- Code for NeurIPS LLM Efficiency Challenge ☆57 Updated last year
- ☆64 Updated 6 months ago