vast-data / VUA
VUA stands for 'VAST Undivided Attention'. It's a global KVCache storage solution optimizing LLM time to first token (TTFT) and GPU utilization.
☆18 Updated last month
Alternatives and similar repositories for VUA
Users who are interested in VUA are comparing it to the libraries listed below.
- CUDA checkpoint and restore utility ☆350 Updated 5 months ago
- NVIDIA Inference Xfer Library (NIXL) ☆484 Updated this week
- KV cache store for distributed LLM inference ☆292 Updated last month
- ☆63 Updated 5 months ago
- ☆230 Updated this week
- GPUd automates monitoring, diagnostics, and issue identification for GPUs ☆393 Updated this week
- ☆313 Updated 11 months ago
- Perplexity GPU Kernels ☆407 Updated last week
- NVIDIA NCCL Tests for Distributed Training ☆97 Updated this week
- Ultra and Unified CCL ☆426 Updated this week
- Serverless LLM Serving for Everyone. ☆511 Updated this week
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆193 Updated last week
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆189 Updated this week
- MSCCL++: A GPU-driven communication stack for scalable AI applications ☆387 Updated this week
- Efficient and easy multi-instance LLM serving ☆451 Updated this week
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud. ☆118 Updated last year
- ☆52 Updated 4 months ago
- OME is a Kubernetes operator for enterprise-grade management and serving of Large Language Models (LLMs) ☆192 Updated this week
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications. ☆177 Updated this week
- MLPerf™ Storage Benchmark Suite ☆157 Updated last week
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆365 Updated last week
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆147 Updated this week
- NCCL Profiling Kit ☆139 Updated last year
- A tool to detect infrastructure issues on cloud native AI systems ☆44 Updated last week
- CloudAI Benchmark Framework ☆69 Updated this week
- An I/O benchmark for deep learning applications ☆88 Updated last month
- NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs ☆549 Updated 2 months ago
- A validation and profiling tool for AI infrastructure ☆323 Updated last week
- Fast OS-level support for GPU checkpoint and restore ☆218 Updated last week
- Microsoft Collective Communication Library ☆63 Updated 8 months ago