inclusionAI / asystem-awex

A high-performance weight synchronization framework for RL training and inference, designed to propagate parameter updates from training to inference within seconds in RL workflows.

☆80

Alternatives and similar repositories for asystem-awex

Users who are interested in asystem-awex are comparing it to the libraries listed below.

ByteDance-Seed / ByteCheckpoint
ByteCheckpoint: A Unified Checkpointing Library for LFMs
☆252 Updated 4 months ago
OpenSQZ / MegatronApp
Toolchain built around Megatron-LM for distributed training
☆76 Updated this week
hao-ai-lab / LookaheadReasoning
[NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning
☆52 Updated 3 weeks ago
meta-pytorch / torchcomms
torchcomms: a modern PyTorch communications API
☆291 Updated this week
hao-ai-lab / vllm-ltr
[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
☆64 Updated last year
radixark / miles
☆199 Updated this week
stepfun-ai / StepMesh
☆320 Updated last week
sgl-project / genai-bench
Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv…
☆232 Updated this week
deepseek-ai / LPLB
An early-research-stage MoE load balancer based on linear programming.
☆228 Updated this week
kvcache-ai / TrEnv-X
☆67 Updated 2 months ago
RLsys-Foundation / TritonForge
🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation…
☆99 Updated 2 weeks ago
project-etalon / etalon
LLM Serving Performance Evaluation Harness
☆80 Updated 9 months ago
bytedance / InfiniStore
KV cache store for distributed LLM inference
☆363 Updated last week
tyler-griggs / melange-release
☆48 Updated last year
mit-han-lab / flash-moba
☆143 Updated last week
yaof20 / Flash-RL
Implementation of FP8/INT8 rollout for RL training without a performance drop.
☆275 Updated 2 weeks ago
hao-ai-lab / MuxServe
☆79 Updated last month
ISEEKYAN / mbridge
Bridge Megatron-Core to Hugging Face/Reinforcement Learning
☆159 Updated last week
WukLab / preble
Stateful LLM Serving
☆88 Updated 8 months ago
antgroup / DeepXTrace
DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments.
☆68 Updated 2 weeks ago
sgl-project / sglang-jax
JAX backend for SGL
☆175 Updated this week
snowflakedb / ArcticInference
ArcticInference: vLLM plugin for high-throughput, low-latency inference
☆300 Updated last week
microsoft / ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
☆196 Updated last year
ByteDance-Seed / StragglerAnalysis
☆43 Updated 6 months ago
ByteDance-Seed / FlexPrefill
Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
☆154 Updated last month
NVIDIA-NeMo / Megatron-Bridge
Training library for Megatron-based models
☆209 Updated this week
CalvinXKY / mfu_calculation
A simple calculation for LLM MFU.
☆50 Updated 2 months ago
LMCache / lmcache-vllm
The driver for LMCache core to run in vLLM
☆58 Updated 9 months ago
sgl-project / sgl-flash-attn
Fast and memory-efficient exact attention
☆14 Updated last week
andy-yang-1 / DoubleSparse
16-fold memory access reduction with nearly no loss
☆107 Updated 7 months ago