vast-data / VUA
VUA stands for 'VAST Undivided Attention'. It is a global KV-cache storage solution that optimizes LLM time to first token (TTFT) and GPU utilization.
☆14 · Updated this week
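The description above says VUA is a global KV-cache store aimed at cutting time to first token. The snippet below is a minimal, self-contained sketch of the general idea behind such stores: reuse cached key/value tensors for a shared prompt prefix so the server only has to prefill the uncached suffix. All names here (`PrefixKVStore`, `prefill_with_cache`) are hypothetical illustrations, not VUA's actual API.

```python
# Hypothetical sketch of prefix KV-cache reuse; not VUA's actual API.
# Idea: before prefill, look up the longest cached token prefix and only
# compute keys/values for the uncached suffix, which shortens TTFT.

from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple
import hashlib


@dataclass
class KVEntry:
    """Opaque stand-in for the key/value tensors of one cached prefix."""
    tokens: Tuple[int, ...]
    kv_blob: bytes  # placeholder for serialized KV tensors


@dataclass
class PrefixKVStore:
    """Toy global KV-cache store keyed by a hash of the token prefix."""
    _store: Dict[str, KVEntry] = field(default_factory=dict)

    @staticmethod
    def _key(tokens: Tuple[int, ...]) -> str:
        return hashlib.sha256(repr(tokens).encode()).hexdigest()

    def put(self, tokens: List[int], kv_blob: bytes) -> None:
        t = tuple(tokens)
        self._store[self._key(t)] = KVEntry(t, kv_blob)

    def longest_prefix(self, tokens: List[int]) -> Optional[KVEntry]:
        """Return the entry for the longest cached token prefix, if any."""
        for end in range(len(tokens), 0, -1):
            entry = self._store.get(self._key(tuple(tokens[:end])))
            if entry is not None:
                return entry
        return None


def prefill_with_cache(store: PrefixKVStore, prompt_tokens: List[int]) -> int:
    """Return how many tokens still need prefill after consulting the cache."""
    hit = store.longest_prefix(prompt_tokens)
    cached = len(hit.tokens) if hit else 0
    return len(prompt_tokens) - cached


if __name__ == "__main__":
    store = PrefixKVStore()
    system_prompt = [1, 2, 3, 4]              # tokens shared across requests
    store.put(system_prompt, kv_blob=b"...")  # populated by an earlier request
    request = system_prompt + [9, 8, 7]       # new request reusing the prefix
    print("tokens left to prefill:", prefill_with_cache(store, request))  # -> 3
```

In a real serving stack the KV blobs would live in a shared or tiered storage backend rather than an in-process Python dict; the lookup-before-prefill flow is the part this sketch is meant to illustrate.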
Alternatives and similar repositories for VUA
Users interested in VUA are comparing it to the libraries listed below.
- GPUd automates monitoring, diagnostics, and issue identification for GPUs ☆375 · Updated this week
- ☆221 · Updated this week
- A toolkit for discovering cluster network topology. ☆54 · Updated last week
- NVIDIA Inference Xfer Library (NIXL) ☆413 · Updated this week
- KV cache store for distributed LLM inference ☆269 · Updated 2 weeks ago
- NVIDIA NCCL Tests for Distributed Training ☆97 · Updated this week
- Serverless LLM Serving for Everyone. ☆488 · Updated this week
- Systematic and comprehensive benchmarks for LLM systems. ☆17 · Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆114 · Updated this week
- Helm charts for llm-d ☆42 · Updated this week
- High-performance safetensors model loader ☆39 · Updated 2 weeks ago
- Efficient and easy multi-instance LLM serving ☆437 · Updated this week
- CUDA checkpoint and restore utility ☆345 · Updated 4 months ago
- A tool to detect infrastructure issues on cloud native AI systems ☆39 · Updated last month
- ☆62 · Updated 4 months ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆129 · Updated last month
- The driver for LMCache core to run in vLLM ☆41 · Updated 4 months ago
- All-in-Storage Solution based on DiskANN for DRAM-free Approximate Nearest Neighbor Search ☆57 · Updated 4 months ago
- An I/O benchmark for deep learning applications ☆87 · Updated this week
- An Open Source, Cloud-native AI Infrastructure Platform. Not Just GPUs. ☆42 · Updated 3 weeks ago
- Inference scheduler for llm-d ☆56 · Updated this week
- ☆310 · Updated 10 months ago
- Run Slurm on Kubernetes. A Slinky project. ☆119 · Updated 2 weeks ago
- ☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work! ☆204 · Updated this week
- ☆47 · Updated 11 months ago
- ☆36 · Updated this week
- ScalarLM - a unified training and inference stack ☆39 · Updated last month
- Lightweight daemon for monitoring CUDA runtime API calls with eBPF uprobes ☆98 · Updated 2 months ago
- Route LLM requests to the best model for the task at hand. ☆66 · Updated this week
- Perplexity GPU Kernels ☆364 · Updated last week