FlagOpen / FlagScaleLinks

FlagScale is a large model toolkit based on open-sourced projects.

☆362

Alternatives and similar repositories for FlagScale

Users that are interested in FlagScale are comparing it to the libraries listed below

Sorting:

InternLM / InternEvo
InternEvo is an open-sourced lightweight training framework aims to support model pre-training without the need for extensive dependencie…
☆410Updated 2 months ago
sgl-project / SpecForge
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
☆428Updated last week
Tencent / KsanaLLM
☆507Updated last month
feifeibear / long-context-attention
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
☆576Updated last week
alibaba / rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
☆892Updated this week
alibaba / ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
☆431Updated this week
LLMServe / DistServe
Disaggregated serving system for Large Language Models (LLMs).
☆706Updated 6 months ago
hahnyuan / LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod…
☆567Updated last year
alibaba / Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
☆660Updated last year
modelscope / dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …
☆266Updated 2 months ago
madsys-dev / deepseekv2-profile
☆148Updated 7 months ago
sail-sg / zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism
☆432Updated 5 months ago
volcengine / veScale
A PyTorch Native LLM Training Framework
☆875Updated last month
alibaba / Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
☆1,380Updated last week
kwai / Megatron-Kwai
[USENIX ATC '24] Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Paral…
☆65Updated last year
sgl-project / sgl-learning-materials
Materials for learning SGLang
☆615Updated 3 weeks ago
FlagOpen / FlagGems
FlagGems is an operator library for large language models implemented in the Triton Language.
☆696Updated this week
DeepLink-org / dlinfer
☆63Updated last week
OpenPPL / ppl.llm.serving
☆129Updated 9 months ago
Ascend / pytorch
Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch
☆439Updated last month
fanshiqing / grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
☆154Updated last week
feifeibear / LLMSpeculativeSampling
Fast inference from large lauguage models via speculative decoding
☆835Updated last year
Ascend / AscendSpeed
☆79Updated last year
ModelTC / LightCompress
A powerful toolkit for compressing large models including LLM, VLM, and video generation models.
☆593Updated 2 months ago
FlagOpen / FlagAttention
A collection of memory efficient attention operators implemented in the Triton language.
☆281Updated last year
antgroup / glake
GLake: optimizing GPU memory management and IO transmission.
☆481Updated 6 months ago
bytedance / ByteTransformer
optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052
☆479Updated last year
stepfun-ai / Step3
☆428Updated 2 months ago
intelligent-machine-learning / atorch
An industrial extension library of pytorch to accelerate large scale model training
☆49Updated 2 months ago
zhuzilin / ring-flash-attention
Ring attention implementation with flash attention
☆901Updated last month