argonne-lcf / Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆17 · Updated last week
Alternatives and similar repositories for Megatron-DeepSpeed
Users who are interested in Megatron-DeepSpeed are comparing it to the libraries listed below.
- Cosmic Tagging Network for Neutrino Physics ☆13 · Updated last year
- Material for the SC22 Deep Learning at Scale Tutorial ☆41 · Updated last year
- SC24 Deep Learning at Scale Tutorial Material ☆33 · Updated 4 months ago
- ☆37 · Updated 2 months ago
- A parallel framework for training deep neural networks ☆61 · Updated 3 months ago
- A repository with examples to run inference endpoints on various ALCF clusters ☆22 · Updated last week
- Benchmark implementation of CosmoFlow in TensorFlow Keras ☆21 · Updated last year
- Sparsity support for PyTorch ☆35 · Updated 3 months ago
- ☆109 · Updated 3 months ago
- CPU and GPU tutorial examples ☆13 · Updated 2 months ago
- ☆21 · Updated 4 years ago
- COCCL: Compression and precision co-aware collective communication library ☆22 · Updated 3 months ago
- A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate ☆170 · Updated this week
- A hands-on introduction to tuning GPU kernels using Kernel Tuner (https://github.com/KernelTuner/kernel_tuner/) ☆31 · Updated 2 months ago
- Guidelines on using Weights and Biases logging for deep learning applications on NERSC machines ☆13 · Updated last year
- Collection of scripts to build PyTorch and the domain libraries from source ☆12 · Updated last week
- Cataloging released Triton kernels ☆238 · Updated 5 months ago
- ☆167 · Updated last year
- A bunch of kernels that might make stuff slower 😉 ☆51 · Updated this week
- A Parallel Code Evaluation Benchmark ☆33 · Updated 2 weeks ago
- AI Training Series Material ☆37 · Updated 9 months ago
- Graph-indexed Pandas DataFrames for analyzing hierarchical performance data ☆33 · Updated last week
- AI Accelerators SC23 Tutorial Repository ☆11 · Updated last year
- ☆18 · Updated 5 years ago
- Collection of small examples for running on ALCF resources