vllm-project / vllm-spyre
Community-maintained hardware plugin for vLLM on Spyre
☆32 Updated this week
Alternatives and similar repositories for vllm-spyre
Users who are interested in vllm-spyre are comparing it to the libraries listed below.
- llm-d benchmark scripts and tooling ☆25 Updated last week
- ☆22 Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆81 Updated this week
- Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs ☆12 Updated 5 months ago
- A tool to detect infrastructure issues on cloud native AI systems ☆47 Updated last month
- Cloud Native Benchmarking of Foundation Models ☆41 Updated last month
- ☆54 Updated this week
- A hierarchical collective communications library with portable optimizations ☆36 Updated 8 months ago
- COCCL: Compression and precision co-aware collective communication library ☆24 Updated 5 months ago
- ☆12 Updated last week
- A recommendation model kernel optimizing system ☆10 Updated 3 months ago
- This is the public repo for the MLPerf DeepCAM climate data segmentation proposal. ☆16 Updated 3 years ago
- Systematic and comprehensive benchmarks for LLM systems. ☆28 Updated 3 weeks ago
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆62 Updated 2 months ago
- ☆16 Updated 4 months ago
- ☆20 Updated last week
- Ongoing research training transformer models at scale ☆29 Updated this week
- An I/O benchmark for deep Learning applications ☆90 Updated last week
- CloudAI Benchmark Framework ☆71 Updated this week
- High-Performance Linpack Benchmark adopted version for GPU backend ☆11 Updated 2 years ago
- A Micro-benchmarking Tool for HPC Networks ☆32 Updated last month
- NVIDIA NCCL Tests for Distributed Training ☆110 Updated last week
- QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression. ☆32 Updated last week
- Get started with your NVIDIA Arm HPC Developers Kit! ☆33 Updated 2 years ago
- parser script to process pytorch autograd profiler result, convert json file to excel. ☆15 Updated 5 years ago
- Reference implementations of MLPerf™ HPC training benchmarks ☆49 Updated 6 months ago
- A CUTLASS implementation using SYCL ☆38 Updated this week
- IBM Z Deep Neural Network Library (zDNN) provides an interface for applications making use of Neural Network Processing Assist Facility (… ☆16 Updated 4 months ago
- MAD (Model Automation and Dashboarding) ☆24 Updated this week
- oneCCL Bindings for Pytorch* ☆101 Updated last month