volcengine / veGiantModel

☆219

Alternatives and similar repositories for veGiantModel

Users who are interested in veGiantModel are comparing it to the libraries listed below.

alibaba / EasyParallelLibrary
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
☆269 Updated 2 years ago
Oneflow-Inc / libai
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
☆407 Updated 2 months ago
Oneflow-Inc / DLPerf
DeepLearning Framework Performance Profiling Toolkit
☆292 Updated 3 years ago
bytedance / effective_transformer
Running BERT without Padding
☆475 Updated 3 years ago
bytedance / ByteTransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
☆479 Updated last year
Ascend / AscendSpeed
☆79 Updated last year
Oneflow-Inc / OneFlow-Benchmark
OneFlow models for benchmarking.
☆104 Updated last year
DeepRec-AI / HybridBackend
A high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters
☆159 Updated last year
THUDM / FasterTransformer
Transformer related optimization, including BERT, GPT
☆39 Updated 2 years ago
OpenPPL / ppl.llm.serving
☆129 Updated 9 months ago
void-main / FasterTransformer
Transformer related optimization, including BERT, GPT
☆59 Updated 2 years ago
alibaba / TePDist
TePDist (TEnsor Program DISTributed) is an HLO-level automatic distributed system for DL models.
☆97 Updated 2 years ago
Oneflow-Inc / oneflow-documentation
OneFlow documentation
☆69 Updated last year
triton-inference-server / hugectr_backend
☆56 Updated 2 years ago
alibaba / Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
☆660 Updated last year
Oneflow-Inc / models
Models and examples built with OneFlow
☆100 Updated last year
YellowOldOdd / SDBI
Simple Dynamic Batching Inference
☆145 Updated 3 years ago
NetEase-FuXi / EET
Easy and Efficient Transformer: Scalable Inference Solution for Large NLP Models
☆265 Updated 10 months ago
PKU-DAIR / Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training.
☆324 Updated 2 months ago
FlagOpen / FlagScale
FlagScale is a large model toolkit based on open-sourced projects.
☆362 Updated this week
Tencent / KsanaLLM
☆507 Updated last month
sail-sg / zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
☆432 Updated 5 months ago
antgroup / glake
GLake: optimizing GPU memory management and IO transmission.
☆481 Updated 6 months ago
AlibabaPAI / torchacc
PyTorch distributed training acceleration framework
☆53 Updated 2 months ago
OpenPPL / ppl.nn.llm
☆139 Updated last year
Hsword / Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, …
☆121 Updated last year
hpcaitech / EnergonAI
Large-scale model inference.
☆631 Updated 2 years ago
Jack47 / hack-SysML
The road to hack SysML and become a system expert
☆498 Updated last year
kwai / Megatron-Kwai
[USENIX ATC '24] Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Paral…
☆65 Updated last year
alibaba / rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
☆892 Updated this week