Open Source Continuous Inference Benchmark Research Platform Kimi K2.6, DeepSeekv4, GLM5 - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 & soon™ TPUv6e/v7/Trainium2/3
☆1,078Jun 11, 2026Updated this week
Alternatives and similar repositories for InferenceX
Users that are interested in InferenceX are comparing it to the libraries listed below.
- This repository contains the results and code for the MLPerf™ Inference v4.0 benchmark.☆11Jul 24, 2025Updated 10 months ago
- Offline optimization of your disaggregated Dynamo graph☆335Updated this week
- Kubernetes CSI Driver for serving OCI model artifacts☆27May 25, 2026Updated 3 weeks ago
- The Intelligent Inference Scheduler for Large-scale Inference Services.☆68Feb 12, 2026Updated 4 months ago
- AIPerf is a comprehensive benchmarking tool that measures the performance of generative AI models served by your preferred inference solu…☆368Updated this week
- Code for "What really matters in matrix-whitening optimizers?"☆24Oct 31, 2025Updated 7 months ago
- Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T…☆464Updated this week
- See the Wiki page below for details about the SR/IOV patch set☆20Sep 28, 2021Updated 4 years ago
- ☆33Apr 19, 2025Updated last year
- Primus-SaFE(Stability and Fault Endurance)☆56Updated this week
- Achieve state of the art inference performance with modern accelerators on Kubernetes☆3,351Updated this week
- NVIDIA Inference Xfer Library (NIXL)☆1,079Updated this week
- ☆16Jul 8, 2024Updated last year
- A Datacenter Scale Distributed Inference Serving Framework☆7,248Updated this week
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv…☆300May 14, 2026Updated last month
- Resource Exporter for volcano scheduling, e.g. NUMA-Aware scheduling.☆19May 30, 2025Updated last year
- The tool facilitates debugging convergence issues and testing new algorithms and recipes for training LLMs using Nvidia libraries such as…☆20Sep 17, 2025Updated 8 months ago
- A Quirky Assortment of CuTe Kernels☆1,007May 30, 2026Updated 2 weeks ago
- Auto-tuning for vllm. Getting the best performance out of your LLM deployment (vllm+guidellm+optuna)☆57Mar 17, 2026Updated 2 months ago
- WG Serving☆37Mar 24, 2026Updated 2 months ago
- Repository for AI model benchmarking on TT-Buda☆16Feb 9, 2026Updated 4 months ago
- ☆40Dec 14, 2025Updated 6 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆13Updated this week
- ☆16Sep 24, 2024Updated last year
- Empowering LLM Agents for Real-World Computer System Optimization☆18Sep 10, 2025Updated 9 months ago
- LLMPerf is a library for validating and benchmarking LLMs☆1,120Dec 9, 2024Updated last year
- GenAI inference performance benchmarking tool☆195Jun 8, 2026Updated last week
- Benchmark tests supporting the TiledCUDA library.☆19Nov 19, 2024Updated last year
- ☆18May 6, 2026Updated last month
- ☆16Nov 24, 2025Updated 6 months ago
- GPU prices aggregator for cloud providers☆49Updated this week
- A distributed in-memory store for temporal knowledge graphs☆10Mar 20, 2024Updated 2 years ago
- This repo hosts code for vLLM CI & Performance Benchmark infrastructure.☆43Updated this week
- LLM training parallelisms (DP, FSDP, TP, PP) in pure C☆29Jan 27, 2026Updated 4 months ago
- ☆92Feb 12, 2026Updated 4 months ago
- A signal processing library in Rust, with the goal of being a decent alternative to Matlab's Signal Processing Toolbox and scipy.signal☆22Updated this week
- redis module unit tests with python (deprecated) please see RLTest☆12Sep 8, 2019Updated 6 years ago
- vLLM Daily Summarization of Merged PRs☆51Jun 7, 2026Updated last week
- Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.☆133Jun 8, 2026Updated last week