epoch-research / Compute-Trends

Supplementary material for our paper "Compute Trends Across Three Eras of Machine Learning".

☆41

Alternatives and similar repositories for Compute-Trends

Users who are interested in Compute-Trends compare it to the repositories listed below.

UmerHA / triton_util
Make triton easier
☆47 Updated last year
Michaelvll / llm-ie-benchmarks
A collection of reproducible inference engine benchmarks
☆32 Updated 3 months ago
Jokeren / triton-samples
☆28 Updated 6 months ago
google-research-datasets / tpu_graphs
☆125 Updated last year
stanford-cs324 / winter2023
☆37 Updated 2 years ago
HabanaAI / Megatron-DeepSpeed
Intel Gaudi's Megatron DeepSpeed Large Language Models for training
☆13 Updated 7 months ago
graphcore-research / out-of-the-box-fp8-training
Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.
☆46 Updated last year
groq / mlagility
Machine Learning Agility (MLAgility) benchmark and benchmarking tools
☆39 Updated last week
graphcore / tutorials
Training material for IPU users: tutorials, feature examples, simple applications
☆86 Updated 2 years ago
determined-ai / determined-examples
Example ML projects that use the Determined library.
☆32 Updated 10 months ago
ScalingIntelligence / good-kernels
Samples of good AI generated CUDA kernels
☆86 Updated 2 months ago
google-deepmind / dks
Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo…
☆71 Updated last month
gpu-mode / discord-cluster-manager
Write a fast kernel and run it on Discord. See how you compare against the best!
☆48 Updated last week
softmax1 / Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with SoftmaxN.
☆72 Updated last year
octoml / octoml-profile
Home for OctoML PyTorch Profiler
☆113 Updated 2 years ago
HazyResearch / train-tk
train with kittens!
☆62 Updated 9 months ago
deepspeedai / DeepSpeed-Kernels
☆74 Updated 4 months ago
facebookresearch / MODel_opt
Memory Optimizations for Deep Learning (ICML 2023)
☆102 Updated last year
ezyang / torchdbg
PyTorch centric eager mode debugger
☆47 Updated 7 months ago
DS3Lab / CocktailSGD
☆27 Updated last year
stanford-futuredata / stk
☆108 Updated 11 months ago
yandex-research / swarm
Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient"
☆141 Updated last year
axonn-ai / axonn
A parallel framework for training deep neural networks
☆63 Updated 4 months ago
tanyuqian / redco
NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
☆66 Updated 8 months ago
Ying1123 / awesome-neural-symbolic
A list of awesome neural symbolic papers.
☆47 Updated 3 years ago
srush / triton-autodiff
Experiment of using Tangent to autodiff triton
☆80 Updated last year
GindaChen / FlexFlashAttention3
FlexAttention w/ FlashAttention3 Support
☆27 Updated 10 months ago
drisspg / transformer_nuggets
A place to store reusable transformer components of my own creation or found on the interwebs
☆59 Updated last week
ShishirPatil / poet
ML model training for edge devices
☆165 Updated last year
alexzhang13 / Triton-Puzzles-Solutions
Personal solutions to the Triton Puzzles
☆19 Updated last year