vllm-project / vllm-spyre
Community maintained hardware plugin for vLLM on Spyre
☆35 Updated last week
Alternatives and similar repositories for vllm-spyre
Users who are interested in vllm-spyre are comparing it to the libraries listed below.
- Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs ☆12 Updated 6 months ago
- A hierarchical collective communications library with portable optimizations ☆36 Updated 10 months ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆119 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆83 Updated this week
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs ☆41 Updated this week
- ☆23 Updated last week
- ☆25 Updated last month
- Reference implementations of MLPerf™ HPC training benchmarks ☆49 Updated 7 months ago
- This is the public repo for the MLPerf DeepCAM climate data segmentation proposal. ☆16 Updated 2 weeks ago
- Multi-GPU communication profiler and visualizer ☆35 Updated last year
- ☆59 Updated this week
- Ongoing research training transformer models at scale ☆29 Updated this week
- CloudAI Benchmark Framework ☆71 Updated this week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆63 Updated 3 months ago
- TACOS: [T]opology-[A]ware [Co]llective Algorithm [S]ynthesizer for Distributed Machine Learning ☆26 Updated 4 months ago
- Provides the examples to write and build Habana custom kernels using the HabanaTools ☆23 Updated 6 months ago
- parser script to process pytorch autograd profiler result, convert json file to excel. ☆15 Updated 6 years ago
- CUDA GPU Benchmark ☆33 Updated 8 months ago
- A tool to detect infrastructure issues on cloud native AI systems ☆48 Updated last month
- ATLAHS: An Application-centric Network Simulator Toolchain for AI, HPC, and Distributed Storage ☆43 Updated last month
- RCCL Performance Benchmark Tests ☆75 Updated last week
- A recommendation model kernel optimizing system ☆11 Updated 4 months ago
- OpenAI Triton backend for Intel® GPUs ☆211 Updated this week
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆41 Updated last year
- Code samples related to Intel(R) AMX ☆39 Updated last year
- Fast GPU based tensor core reductions ☆13 Updated 2 years ago
- Slides and exercises for persistent memory programming tutorial ☆14 Updated 2 years ago
- Cloud Native Benchmarking of Foundation Models ☆44 Updated 2 months ago
- Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://dev… ☆61 Updated last month
- Repository for MLCommons Chakra schema and tools ☆131 Updated 3 weeks ago