Quentin-Anthony / torch-profiling-tutorial

☆441

Alternatives and similar repositories for torch-profiling-tutorial

Users who are interested in torch-profiling-tutorial are comparing it to the repositories listed below.

tugot17 / pmpp
Complete solutions to Programming Massively Parallel Processors, 4th Edition
☆444 Updated last month
MarioSieg / magnetron
(WIP) A small but powerful, homemade PyTorch from scratch.
☆554 Updated this week
xjdr-alt / simple_transformer
Simple Transformer in Jax
☆138 Updated last year
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆188 Updated 2 months ago
LambdaLabsML / distributed-training-guide
Best practices & guides on how to write distributed pytorch training code
☆460 Updated 5 months ago
google-deepmind / nanodo
☆274 Updated last year
clu0 / unet.cu
UNet diffusion model in pure CUDA
☆612 Updated last year
EleutherAI / cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
☆808 Updated 2 weeks ago
Laz4rz / GPT-2
Following master Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish
☆172 Updated last year
rkinas / triton-resources
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆382 Updated 4 months ago
francoisfleuret / dlc
☆42 Updated 7 months ago
kvfrans / jax-diffusion-transformer
Implementation of Diffusion Transformer (DiT) in JAX
☆280 Updated last year
jax-ml / scaling-book
Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs
☆440 Updated last week
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆64 Updated 2 months ago
smolorg / smolgrad
small auto-grad engine inspired from Karpathy's micrograd and PyTorch
☆274 Updated 8 months ago
Maharshi-Pandya / cudacodes
Learnings and programs related to CUDA
☆414 Updated last month
srush / Autodiff-Puzzles
☆443 Updated 9 months ago
rwitten / HighPerfLLMs2024
☆516 Updated last year
marin-community / marin
☆336 Updated this week
pytorch-labs / monarch
PyTorch Single Controller
☆341 Updated this week
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆277 Updated last week
LeonGuertler / UnstableBaselines
☆94 Updated this week
EurekaLabsAI / tensor
The Tensor (or Array)
☆441 Updated 11 months ago
gpu-mode / profiling-cuda-in-torch
☆162 Updated last year
brendanhogan / DeepSeekRL-Extended
Exploring Applications of GRPO
☆244 Updated 3 weeks ago
srush / Transformer-Puzzles
Puzzles for exploring transformers
☆355 Updated 2 years ago
joey00072 / Tinytorch
A really tiny autograd engine
☆95 Updated 2 months ago
BlackHC / neural_net_checklist
☆150 Updated 11 months ago
McGill-NLP / nano-aha-moment
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
☆506 Updated 3 weeks ago
VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆68 Updated 3 months ago