inclusionAI / asystem-awex
A high-performance RL training-inference weight synchronization framework, designed to enable second-level parameter updates from training to inference in RL workflows.
☆129 · Updated last month
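The core idea of such a framework, letting a trainer publish fresh weight snapshots that inference workers can swap in within seconds, can be sketched in a few lines. This is a minimal illustrative sketch only; the `WeightStore`, `publish`, and `snapshot` names are hypothetical and are not the asystem-awex API.

```python
# Hypothetical sketch of training -> inference weight synchronization.
# Not the asystem-awex API; real systems sync GPU tensors across processes.
import threading


class WeightStore:
    """Versioned weight snapshot shared between trainer and inference."""

    def __init__(self, weights):
        self._lock = threading.Lock()
        self.version = 0
        self.weights = dict(weights)

    def publish(self, new_weights):
        # Training side: atomically swap in a new snapshot and bump the version.
        with self._lock:
            self.weights = dict(new_weights)
            self.version += 1

    def snapshot(self):
        # Inference side: read the latest consistent (version, weights) pair.
        with self._lock:
            return self.version, dict(self.weights)


store = WeightStore({"layer0": [0.0, 0.0]})
store.publish({"layer0": [0.1, -0.2]})  # called after a training step
version, weights = store.snapshot()
print(version, weights["layer0"])  # -> 1 [0.1, -0.2]
```

A production system replaces the in-process lock with cross-process or cross-node transport (e.g. collective communication or RDMA) so the swap stays fast at multi-GPU scale.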
Alternatives and similar repositories for asystem-awex
Users interested in asystem-awex are comparing it to the libraries listed below.
- ByteCheckpoint: A Unified Checkpointing Library for LFMs ☆264 · Updated last month
- Toolchain built around Megatron-LM for distributed training ☆84 · Updated last month
- The driver for LMCache core to run in vLLM ☆60 · Updated 11 months ago
- ☆340 · Updated 3 weeks ago
- An early-research-stage expert-parallel load balancer for MoE models based on linear programming ☆491 · Updated 2 months ago
- LLM Serving Performance Evaluation Harness ☆83 · Updated 11 months ago
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆252 · Updated last week
- Efficient Long-context Language Model Training by Core Attention Disaggregation ☆80 · Updated last month
- torchcomms: a modern PyTorch communications API ☆323 · Updated this week
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank ☆67 · Updated last year
- KV cache store for distributed LLM inference ☆387 · Updated 2 months ago
- ☆31 · Updated last month
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆379 · Updated last week
- Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime ☆789 · Updated this week
- Stateful LLM Serving ☆95 · Updated 10 months ago
- Perplexity open-source garden for inference technology ☆350 · Updated last month
- ☆48 · Updated last year
- [ICLR 2025] Breaking the Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆137 · Updated last year
- Nex Venus Communication Library ☆72 · Updated 2 months ago
- Accelerating MoE with IO- and Tile-aware Optimizations ☆563 · Updated last week
- JAX backend for SGL ☆232 · Updated this week
- ☆83 · Updated 3 months ago
- An NCCL extension library, designed to efficiently offload GPU memory allocated by the NCCL communication library ☆87 · Updated last month
- Efficient and easy multi-instance LLM serving ☆523 · Updated 4 months ago
- ☆96 · Updated 10 months ago
- High-performance distributed data shuffling (all-to-all) library for MoE training and inference ☆109 · Updated last month
- [OSDI '24] Serving LLM-based Applications Efficiently with Semantic Variable ☆207 · Updated last year
- ☆73 · Updated 4 months ago
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments ☆90 · Updated 2 weeks ago
- Fast and memory-efficient exact attention ☆110 · Updated last week