Crisescode / distributed-training-dl

Distributed training for various deep learning (DL) frameworks, including: Tensorflow, Tensorflow2, Pytorch, Chainer, Caffe, Mxnet ...

☆21

Alternatives and similar repositories for distributed-training-dl

Users that are interested in distributed-training-dl are comparing it to the libraries listed below

Oneflow-Inc / DLPerf
DeepLearning Framework Performance Profiling Toolkit
☆285Updated 3 years ago
qsyao / cudaBERT
A Fast Multi-processing BERT-Inference System
☆101Updated 2 years ago
Oneflow-Inc / OneFlow-Benchmark
OneFlow models for benchmarking.
☆104Updated last year
cap-ntu / ML-Model-CI
MLModelCI is a complete MLOps platform for managing, converting, profiling, and deploying MLaaS (Machine Learning-as-a-Service), bridging…
☆193Updated 2 years ago
aliyun / alibabacloud-aiacc-demo
alibabacloud-aiacc-demo
☆43Updated 2 years ago
eedalong / Dpex
Distributed DataLoader For Pytorch Based On Ray
☆24Updated 3 years ago
layerism / brpc_faiss_server
Vector Search Engine based on BRPC + FAISS
☆149Updated 5 years ago
Oneflow-Inc / models
Models and examples built with OneFlow
☆98Updated 9 months ago
chncwang / InsNet
InsNet Runs Instance-dependent Neural Networks with Padding-free Dynamic Batching.
☆66Updated 3 years ago
Angel-ML / PyTorch-On-Angel
PyTorch On Angel, arming PyTorch with a powerful Parameter Server, which enables PyTorch to train very big models.
☆168Updated last month
YellowOldOdd / SDBI
Simple Dynamic Batching Inference
☆145Updated 3 years ago
Tencent / WeChat-TFCC
☆127Updated 4 years ago
alibaba / FastNN
FastNN provides distributed training examples that use EPL.
☆83Updated 3 years ago
Rayrtfr / FasterTransformer
Transformer related optimization, including BERT, GPT
☆17Updated 2 years ago
triton-inference-server / hugectr_backend
☆55Updated last year
volcengine / veGiantModel
☆220Updated last year
Oneflow-Inc / oneflow-documentation
oneflow documentation
☆69Updated last year
Oneflow-Inc / oneflow-xrt
☆23Updated 2 years ago
keithyin / read-pytorch-source-code
Reading the PyTorch source code, version 0.2.0
☆90Updated 5 years ago
feifeibear / PyTorchMemTracer
Depict GPU memory footprint during DNN training of PyTorch
☆11Updated 2 years ago
alibaba / EasyParallelLibrary
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
☆267Updated 2 years ago
HadXu / Thunder
A small deep-learning framework with C++/Python/CUDA
☆54Updated 7 years ago
zakheav / automatic-differentiation-framework
An automatic differentiation framework with dynamic graph support
☆100Updated 5 years ago
Oneflow-Inc / serving
OneFlow Serving
☆20Updated 3 months ago
maxwellzh / Pico
Pico is a numpy-based "pico" neural network framework with a torch-like coding style and an auto-grad implementation, with an MNIST example.
☆11Updated 3 years ago
openmlsys / openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
☆130Updated last year
Harry-Chen / InfMoE
Inference framework for MoE layers based on TensorRT with Python binding
☆41Updated 4 years ago
MachineLP / QDServing
Distributed serving deployment for ML models: gRPC, Flask; Docker
☆75Updated 4 years ago
OpenPPL / ppl.llm.serving
☆128Updated 7 months ago
yandili / forge_load
A program that occupies idle GPUs and CPUs
☆36Updated last year