gpauloski / BERT-PyTorch
BERT for Distributed PyTorch + AMP Training
☆12 · Updated last year
Related projects
Alternatives and complementary repositories for BERT-PyTorch
- ☆11 · Updated 3 years ago
- ☆55 · Updated 5 months ago
- ☆26 · Updated 3 years ago
- ☆29 · Updated 5 months ago
- A logging tool for deep learning. ☆51 · Updated 2 years ago
- A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch ☆19 · Updated last week
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training ☆13 · Updated last month
- Performance benchmarking with ColossalAI ☆39 · Updated 2 years ago
- Benchmarking different models with PyTorch 2.0 ☆21 · Updated last year
- ☆26 · Updated last year
- ☆35 · Updated last year
- Benchmarking some transformer deployments ☆26 · Updated last year
- ☆22 · Updated 10 months ago
- Material for the SC21 Deep Learning at Scale Tutorial ☆25 · Updated last year
- OpenVINO LLM Benchmark ☆11 · Updated 11 months ago
- A plug-in for Microsoft DeepSpeed that fixes a bug in the DeepSpeed pipeline ☆26 · Updated 3 years ago
- SParse AcceleRation on Tensor Architecture ☆17 · Updated last month
- CUDA 12.2 HMM demos ☆17 · Updated 3 months ago
- Linear Attention Sequence Parallelism (LASP) ☆64 · Updated 5 months ago
- MLBench Framework Core Python Library ☆16 · Updated last year
- ☆20 · Updated last year
- Benchmarks to capture important workloads. ☆28 · Updated 5 months ago
- Tensor Parallelism with JAX + Shard Map ☆11 · Updated last year
- MLPerf™ logging library ☆30 · Updated last week
- ☆18 · Updated 6 months ago
- Distributed DataLoader for PyTorch Based on Ray ☆24 · Updated 3 years ago
- Awesome Triton Resources ☆18 · Updated 3 weeks ago
- Unit Scaling demo and experimentation code ☆16 · Updated 8 months ago
- Inference framework for MoE layers based on TensorRT with Python binding ☆41 · Updated 3 years ago
- Distributed K-FAC Preconditioner for PyTorch ☆80 · Updated this week