volcengine / veScaleLinks

A PyTorch Native LLM Training Framework

☆875

Alternatives and similar repositories for veScale

Users that are interested in veScale are comparing it to the libraries listed below

Sorting:

LLMServe / DistServe
Disaggregated serving system for Large Language Models (LLMs).
☆709Updated 6 months ago
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,186Updated last week
sail-sg / zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
☆433Updated 5 months ago
bytedance / flux
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
☆1,153Updated last month
efeslab / Nanoflow
A throughput-oriented high-performance serving framework for LLMs
☆909Updated this week
antgroup / glake
GLake: optimizing GPU memory management and IO transmission.
☆483Updated 7 months ago
sgl-project / sgl-learning-materials
Materials for learning SGLang
☆618Updated 3 weeks ago
zhuzilin / ring-flash-attention
Ring attention implementation with flash attention
☆903Updated last month
AlibabaPAI / llumnix
Efficient and easy multi-instance LLM serving
☆502Updated last month
perplexityai / pplx-kernels
Perplexity GPU Kernels
☆497Updated last month
feifeibear / long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
☆582Updated last week
sgl-project / SpecForge
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
☆439Updated this week
uccl-project / uccl
Ultra and Unified CCL
☆595Updated this week
ai-dynamo / nixl
NVIDIA Inference Xfer Library (NIXL)
☆673Updated this week
hahnyuan / LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod…
☆567Updated last year
FlagOpen / FlagScale
FlagScale is a large model toolkit based on open-sourced projects.
☆364Updated this week
cli99 / llm-analysis
Latency and Memory Analysis of Transformer Models for Training and Inference
☆460Updated 6 months ago
alibaba / rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
☆903Updated this week
microsoft / sarathi-serve
A low-latency & high-throughput serving engine for LLMs
☆431Updated last week
FlagOpen / FlagGems
FlagGems is an operator library for large language models implemented in the Triton Language.
☆703Updated this week
ByteDance-Seed / ByteCheckpoint
ByteCheckpoint: An Unified Checkpointing Library for LFMs
☆249Updated 3 months ago
microsoft / vidur
A large-scale simulation framework for LLM inference
☆459Updated 3 months ago
stepfun-ai / StepMesh
☆307Updated 3 weeks ago
Azure / MS-AMP
Microsoft Automatic Mixed Precision Library
☆626Updated last year
AmadeusChan / Awesome-LLM-System-Papers
☆609Updated 5 months ago
ServerlessLLM / ServerlessLLM
Serverless LLM Serving for Everyone.
☆561Updated last week
mit-han-lab / omniserve
[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Se…
☆770Updated 7 months ago
microsoft / mscclpp
MSCCL++: A GPU-driven communication stack for scalable AI applications
☆425Updated this week
galeselee / Awesome_LLM_System-PaperList
Since the emergence of chatGPT in 2022, the acceleration of Large Language Model has become increasingly important. Here is a list of pap…
☆278Updated 7 months ago
SiriusNEO / Triton-Puzzles-Lite
Puzzles for learning Triton, play it with minimal environment configuration!
☆549Updated last month