gpauloski / BERT-PyTorch
BERT for Distributed PyTorch + AMP Training
☆12 Updated 2 years ago
Alternatives and similar repositories for BERT-PyTorch
Users who are interested in BERT-PyTorch are comparing it to the repositories listed below.
- Fast SGEMM emulation on Tensor Cores ☆17 Updated 11 months ago
- ☆17 Updated 2 months ago
- High-Performance Linpack Benchmark adopted version for GPU backend ☆12 Updated 3 years ago
- ALCF Computational Performance Workshop ☆38 Updated 3 years ago
- Argonne Leadership Computing Facility OpenCL tutorial ☆10 Updated 5 months ago
- ext_mpi_collectives ☆11 Updated 10 months ago
- Acceleration codes for the Ozaki-scheme on integer matrix multiplication units. ☆17 Updated last month
- This is the public repo for the MLPerf DeepCAM climate data segmentation proposal. ☆16 Updated 4 months ago
- ☆12 Updated 6 months ago
- A GPU performance prediction toolkit for CUDA programs ☆18 Updated 6 years ago
- ☆11 Updated 10 months ago
- Pragmatic, Productive, and Portable Affinity for HPC ☆51 Updated 3 weeks ago
- Data and reproducibility scripts for the UoB-HPC Performance Portability studies ☆18 Updated last year
- OpenVINO LLM Benchmark ☆11 Updated 2 years ago
- Memory Topology for GPUs ☆17 Updated 2 months ago
- Tools to run and parse MKL verbose mode ☆18 Updated 3 years ago
- This is the open source version of HPL-MXP. The code performance has been verified on Frontier ☆18 Updated 7 months ago
- Benchmark implementation of CosmoFlow in TensorFlow Keras ☆22 Updated 2 years ago
- The ALCF hosts a regular simulation, data, and learning workshop to help users scale their applications. This repository contains the exa… ☆75 Updated last month
- ☆49 Updated 6 months ago
- Material for the SC21 Deep Learning at Scale Tutorial ☆27 Updated 2 years ago
- CPU and GPU tutorial examples ☆13 Updated 10 months ago
- OpenMP offload playground ☆10 Updated last year
- Cosmic Tagging Network for Neutrino Physics ☆13 Updated last year
- automatic GPU offload for scientific libraries ☆16 Updated 3 weeks ago
- Reference implementation for the climate segmentation benchmark, based on the Exascale Deep Learning for Climate Analytics work ☆10 Updated 5 years ago
- Guidelines on using Weights and Biases logging for deep learning applications on NERSC machines ☆13 Updated 2 years ago
- Distributed Communication-Optimal LU-factorization Algorithm ☆12 Updated 4 years ago
- Benchmarks ☆17 Updated 9 months ago
- Exploring Machine Learning methods and workflows in a simplified weather model ☆19 Updated last year