suneeta-mall / deep_learning_at_scaleLinks

Contains hands-on example code for [O'reilly book "Deep Learning At Scale"](https://www.oreilly.com/library/view/deep-learning-at/9781098145279/).

☆27

Alternatives and similar repositories for deep_learning_at_scale

Users that are interested in deep_learning_at_scale are comparing it to the libraries listed below

Sorting:

mlops-discord / gpu-optimization-workshop
Slides, notes, and materials for the workshop
☆328Updated last year
PacktPublishing / Accelerate-Model-Training-with-PyTorch-2.X
Accelerate Model Training with PyTorch 2.X, published by Packt
☆46Updated last year
anyscale / e2e-llm-workflows
Fine-tune an LLM to perform batch inference and online serving.
☆112Updated 2 months ago
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆188Updated 2 months ago
hkproj / quantization-notes
Notes on quantization in neural networks
☆95Updated last year
muellerzr / minimal-trainer-zoo
Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines
☆197Updated last year
gpu-mode / profiling-cuda-in-torch
☆162Updated last year
cfregly / ai-performance-engineering
☆86Updated this week
hkproj / triton-flash-attention
☆184Updated 7 months ago
tcapelle / llm_recipes
A set of scripts and notebooks on LLM finetunning and dataset creation
☆110Updated 10 months ago
rasbt / RAGs
RAGs: Simple implementations of Retrieval Augmented Generation (RAG) Systems
☆123Updated 6 months ago
ThinamXx / Meta-llama
Complete implementation of Llama2 with/without KV cache & inference 🚀
☆48Updated last year
hkproj / pytorch-lora
LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch
☆112Updated 2 years ago
a-hamdi / GPU
100 days of building GPU kernels!
☆477Updated 3 months ago
rasbt / pycon2024
Tutorial Materials for "The Fundamentals of Modern Deep Learning with PyTorch" workshop at PyCon 2024
☆247Updated last year
stas00 / ml-ways
ML/DL Math and Method notes
☆62Updated last year
huggingface / optimum-benchmark
🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of O…
☆307Updated 2 months ago
1y33 / 100Days
GPU Kernels
☆191Updated 3 months ago
rasbt / dora-from-scratch
LoRA and DoRA from Scratch Implementations
☆208Updated last year
mlops-discord / talks
Slides and recordings of talks hosted by our community
☆20Updated last year
milistu / cuda-cudnn-installation
How to install CUDA & cuDNN for Machine Learning
☆20Updated last year
rkinas / triton-resources
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆383Updated 4 months ago
huggingface / picotron_tutorial
☆208Updated 5 months ago
FrancescoSaverioZuppichini / detector
☆13Updated 2 years ago
warner-benjamin / commented-transformers
Highly commented implementations of Transformers in PyTorch
☆136Updated 2 years ago
aniketmaurya / python-project-template
A template to kick-start your Python project ✨🚀
☆52Updated 3 weeks ago
gpu-mode / lecture2
Obsolete version of CUDA-mode repo -- use cuda-mode/lectures instead
☆26Updated last year
PacktPublishing / Pretrain-Vision-and-Large-Language-Models-in-Python
Pretrain Vision and Large Language Models in Python, Published by Packt
☆88Updated last year
huggingface / gpu-fryer
Where GPUs get cooked 👩‍🍳🔥
☆266Updated last week
LambdaLabsML / distributed-training-guide
Best practices & guides on how to write distributed pytorch training code
☆463Updated 5 months ago