stanford-cs336 / assignment2-systems
Student version of Assignment 2 for Stanford CS336 - Language Modeling From Scratch
☆111 · Updated 3 months ago
Alternatives and similar repositories for assignment2-systems
Users interested in assignment2-systems are comparing it to the libraries listed below.
- FlexAttention-based, minimal vLLM-style inference engine for fast Gemma 2 inference. ☆302 · Updated 2 weeks ago
- Making the official Triton tutorials actually comprehensible. ☆61 · Updated 2 months ago
- Efficient LLM Inference over Long Sequences. ☆390 · Updated 4 months ago
- [ICLR 2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation. ☆240 · Updated 11 months ago
- The evaluation framework for training-free sparse attention in LLMs. ☆102 · Updated last month
- An extension of the nanoGPT repository for training small MoE models. ☆210 · Updated 8 months ago
- Ring-attention experiments. ☆155 · Updated last year
- Cataloging released Triton kernels. ☆265 · Updated 2 months ago
- 🔥 LLM-powered GPU kernel synthesis: train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation… ☆98 · Updated last week
- LLM KV cache compression made easy. ☆680 · Updated last week
- KernelBench: Can LLMs Write GPU Kernels? Benchmark with Torch -> CUDA (+ more DSLs). ☆655 · Updated last week
- An efficient implementation of the NSA (Native Sparse Attention) kernel. ☆124 · Updated 4 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models. ☆223 · Updated last week
- [ICLR 2025] Official PyTorch implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule. ☆367 · Updated 2 months ago
- [ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection. ☆145 · Updated 8 months ago
- Asynchronous pipeline-parallel optimization. ☆18 · Updated 5 months ago
- Survey: A collection of AWESOME papers and resources on the latest research in Mixture of Experts. ☆138 · Updated last year
- The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques" (TMLR). ☆80 · Updated 7 months ago
- Cold Compress is a hackable, lightweight, open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆147 · Updated last year
- JAX backend for SGL. ☆163 · Updated this week
- [CoLM'25] The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression". ☆150 · Updated 4 months ago
- Efficient Triton implementation of Native Sparse Attention. ☆247 · Updated 5 months ago
- A minimal cache manager for PagedAttention, on top of llama3. ☆125 · Updated last year