HFAiLab / pytorch_distributed
Tests of different distributed-training methods on High-Flyer AIHPC
☆21, updated 2 years ago
Related projects
Alternatives and complementary repositories for pytorch_distributed
- FireFlyer Record file format, writer and reader for DL training samples. ☆116, updated last year
- HFAI deep learning models ☆87, updated last year
- study of cutlass ☆19, updated this week
- CVFusion is an open-source deep learning compiler to fuse the OpenCV operators. ☆26, updated 2 years ago
- Datasets, Transforms and Models specific to Computer Vision ☆82, updated 11 months ago
- Study notes on the CUDA programming guide ☆27, updated 6 years ago
- Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616 ☆129, updated last year
- An object detection codebase based on MegEngine. ☆28, updated last year
- ICLR 2021 Stats & Graphs ☆31, updated 2 years ago
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18, updated last year
- PyTorch Dataset Rank Dataset ☆40, updated 3 years ago
- A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, … ☆105, updated 10 months ago
- TVMScript kernel for deformable attention ☆24, updated 2 years ago
- A Python library transfers PyTorch tensors between CPU and NVMe ☆96, updated this week
- Slides with modifications for a course at Tsinghua University. ☆57, updated 2 years ago
- OneFlow->ONNX ☆42, updated last year
- A simple program scheduler for your code on different devices. ☆11, updated 2 months ago
- [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling ☆79, updated last year
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆35, updated 8 months ago
- Odysseus: Playground of LLM Sequence Parallelism ☆55, updated 4 months ago
- Performance benchmarking with ColossalAI ☆39, updated 2 years ago
- Summary of system papers/frameworks/codes/tools on training or serving large model ☆56, updated 10 months ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆63, updated 2 years ago
- A MoE impl for PyTorch, [ATC'23] SmartMoE ☆57, updated last year