Open Source Continuous Inference Benchmarking Qwen3.5, DeepSeek, GPTOSS - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 vs H100 & soon™ TPUv6e/v7/Trainium2/3
☆623 Mar 5, 2026 Updated this week
Alternatives and similar repositories for InferenceX
Users who are interested in InferenceX are comparing it to the libraries listed below.
- Offline optimization of your disaggregated Dynamo graph ☆195 Updated this week
- This repository contains the results and code for the MLPerf™ Inference v4.0 benchmark. ☆11 Jul 24, 2025 Updated 7 months ago
- The Intelligent Inference Scheduler for Large-scale Inference Services. ☆64 Feb 12, 2026 Updated 3 weeks ago
- Code for "What really matters in matrix-whitening optimizers?" ☆22 Oct 31, 2025 Updated 4 months ago
- A python library for connecting to Livox LIDAR devices ☆16 Apr 14, 2024 Updated last year
- ☆12 Apr 4, 2022 Updated 3 years ago
- ☆31 Apr 19, 2025 Updated 10 months ago
- ☆27 Oct 15, 2025 Updated 4 months ago
- ☆16 Nov 24, 2025 Updated 3 months ago
- Benchmark tests supporting the TiledCUDA library. ☆18 Nov 19, 2024 Updated last year
- A Triton-only attention backend for vLLM ☆24 Feb 11, 2026 Updated 3 weeks ago
- An inspection tool for sensor_msgs/PointCloud2 messages [ROS1/ROS2] ☆19 Jan 20, 2023 Updated 3 years ago
- A ROS 2 Wrapper for GMSL camera ☆16 Nov 8, 2022 Updated 3 years ago
- LLM training parallelisms (DP, FSDP, TP, PP) in pure C ☆26 Jan 27, 2026 Updated last month
- Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serv… ☆275 Updated this week
- A Path Planner implemented in C++ to drive in Highway using data from Sensor Fusion, Localization and generating waypoints passed to Cont… ☆14 May 29, 2018 Updated 7 years ago
- NVIDIA Inference Xfer Library (NIXL) ☆898 Feb 28, 2026 Updated last week
- Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T… ☆384 Updated this week
- Development of a virtual simulation platform for autonomous vehicle sensing, mapping, control and behaviour methods using ROS and Gazebo.… ☆18 Jun 15, 2021 Updated 4 years ago
- Tensorflow 2.9 Pipeline for Semantic Point Cloud Segmentation with SqueezeSeqV2, Darknet21 and Darknet53. ☆24 Sep 5, 2022 Updated 3 years ago
- Official Project Page for HLA: Higher-order Linear Attention (https://arxiv.org/abs/2510.27258) ☆45 Jan 6, 2026 Updated 2 months ago
- ☆16 Jul 8, 2024 Updated last year
- A multi-level dataflow tracer for capturing I/O calls from workflows. ☆20 Feb 23, 2026 Updated last week
- ☆39 Dec 14, 2025 Updated 2 months ago
- A Quirky Assortment of CuTe Kernels ☆838 Updated this week
- Control vehicle with two steerable axes using ros2_control ☆20 Sep 20, 2021 Updated 4 years ago
- A computationally efficient and robust LiDAR-inertial odometry (LIO) package ☆17 Feb 3, 2022 Updated 4 years ago
- Person-MinkUNet. Winner of JRDB 3D detection challenge in JRDB-ACT Workshop at CVPR 2021. https://arxiv.org/abs/2107.06780 ☆23 Mar 20, 2023 Updated 2 years ago
- FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones ☆64 Jan 26, 2026 Updated last month
- ros interface for cupoch, inspired by perception_open3d ☆23 Mar 10, 2022 Updated 3 years ago
- ☆45 Updated this week
- ☆44 Feb 27, 2026 Updated last week
- ros2 packages for torch2trt examples ☆21 Jul 3, 2022 Updated 3 years ago
- vLLM Daily Summarization of Merged PRs ☆46 Updated this week
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆880 Updated this week
- Achieve state of the art inference performance with modern accelerators on Kubernetes ☆2,543 Updated this week
- Forward Modeling for Partial Observation Strategy Games - A StarCraft Defogger ☆33 Aug 30, 2021 Updated 4 years ago
- AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming ☆179 Updated this week
- KV cache store for distributed LLM inference ☆396 Nov 13, 2025 Updated 3 months ago