HFAiLab / pytorch_distributed
The test of different distributed-training methods on High-Flyer AIHPC
☆24 Updated 2 years ago
Alternatives and similar repositories for pytorch_distributed
Users that are interested in pytorch_distributed are comparing it to the libraries listed below
- differentiable top-k operator ☆21 Updated 5 months ago
- ☆31 Updated last year
- Benchmark tests supporting the TiledCUDA library. ☆16 Updated 7 months ago
- ☆11 Updated last year
- An object detection codebase based on MegEngine. ☆28 Updated 2 years ago
- Odysseus: Playground of LLM Sequence Parallelism ☆70 Updated last year
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18 Updated last year
- ☆19 Updated 2 years ago
- ☆71 Updated last month
- ☆39 Updated this week
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆41 Updated last month
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024). ☆25 Updated last year
- Distributed DataLoader For Pytorch Based On Ray ☆24 Updated 3 years ago
- study of cutlass ☆21 Updated 7 months ago
- Datasets, Transforms and Models specific to Computer Vision ☆85 Updated last year
- An easily extensible framework for understanding and optimizing CUDA operators, intended for learning use only ☆15 Updated last year
- patches for huggingface transformers to save memory ☆24 Updated 3 weeks ago
- Summary of system papers/frameworks/codes/tools on training or serving large model ☆57 Updated last year
- Inference framework for MoE layers based on TensorRT with Python binding ☆41 Updated 4 years ago
- ☆22 Updated last year
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer ☆60 Updated last year
- Self Reproduction Code of Paper "Reducing Transformer Key-Value Cache Size with Cross-Layer Attention" (MIT CSAIL) ☆16 Updated last year
- ☆57 Updated 3 weeks ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆38 Updated last year
- Quantized Attention on GPU ☆44 Updated 7 months ago
- HFAI deep learning models ☆148 Updated 2 years ago
- Dynamic Context Selection for Efficient Long-Context LLMs ☆33 Updated last month
- ☆11 Updated 2 years ago
- pytorch-profiler ☆51 Updated 2 years ago
- [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling ☆86 Updated 2 years ago