DS3Lab / DT-FMLinks

☆93

Alternatives and similar repositories for DT-FM

Users that are interested in DT-FM are comparing it to the libraries listed below

Sorting:

zhengzangw / Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
☆92Updated 2 years ago
SymbioticLab / Oobleck
A resilient distributed training framework
☆96Updated last year
RulinShao / FastCkpt
Python package for rematerialization-aware gradient checkpointing
☆27Updated last year
thu-pacman / FasterMoE
☆87Updated 3 years ago
DS3Lab / Decentralized_FM_alpha
☆19Updated 2 years ago
DachengLi1 / AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
☆41Updated 2 years ago
stanford-futuredata / stk
☆112Updated last year
zhuohan123 / terapipe
☆75Updated 4 years ago
RulinShao / LightSeq
Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training
☆216Updated last year
Distributed-AI / PipeTransformer
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021
☆56Updated 4 years ago
microsoft / varuna
☆252Updated last year
DS3Lab / AC-SGD
Code associated with the paper **Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees**.
☆27Updated 2 years ago
Youhe-Jiang / IJCAI2023-OptimalShardedDataParallel
[IJCAI2023] An automated parallel training system that combines the advantages from both data and model parallelism. If you have any inte…
☆52Updated 2 years ago
spcl / substation
Research and development for optimizing transformers
☆131Updated 4 years ago
hpcaitech / TensorNVMe
A Python library transfers PyTorch tensors between CPU and NVMe
☆120Updated 11 months ago
Hsword / Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, …
☆122Updated last year
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆130Updated last year
kssteven418 / BigLittleDecoder
[NeurIPS'23] Speculative Decoding with Big Little Decoder
☆94Updated last year
shawntan / scattermoe
Triton-based implementation of Sparse Mixture of Experts.
☆246Updated 3 weeks ago
Relaxed-System-Lab / HexGen
[ICML 2024] Serving LLMs on heterogeneous decentralized clusters.
☆30Updated last year
hao-ai-lab / vllm-ltr
[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
☆60Updated 11 months ago
WukLab / InferCept
☆30Updated last year
tanyuqian / redco
NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
☆68Updated 10 months ago
epfml / powersgd
Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727
☆148Updated 11 months ago
exists-forall / striped_attention
☆41Updated last year
awslabs / slapo
A schedule language for large model training
☆151Updated 2 months ago
SymbioticLab / ModelKeeper
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
☆35Updated 2 years ago
microsoft / ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
☆188Updated last year
project-etalon / etalon
LLM Serving Performance Evaluation Harness
☆79Updated 8 months ago
Ying1123 / llm-caching-multiplexing
☆20Updated 2 years ago