Cloud Native Benchmarking of Foundation Models
☆45 · Jul 31, 2025 · Updated 9 months ago
Alternatives and similar repositories for fmperf
Users who are interested in fmperf are comparing it to the libraries listed below.
- Predict the performance of LLM inference services ☆23 · Sep 18, 2025 · Updated 7 months ago
- Community maintained hardware plugin for vLLM on Spyre ☆51 · Updated this week
- Test Orchestrator for Performance and Scalability of AI pLatforms ☆17 · Apr 22, 2026 · Updated last week
- A tool to detect infrastructure issues on cloud native AI systems ☆53 · Sep 18, 2025 · Updated 7 months ago
- How to create and record demos in terminal sessions ☆11 · May 3, 2024 · Updated 2 years ago
- DRANET is a Kubernetes Network Driver that uses Dynamic Resource Allocation (DRA) to deliver high-performance networking for demanding ap… ☆98 · Apr 28, 2026 · Updated last week
- ☆17 · May 8, 2020 · Updated 5 years ago
- Very fast C++ importer from csv files to sqlite3 databases ☆15 · Mar 29, 2016 · Updated 10 years ago
- Variant optimization autoscaler for distributed inference workloads ☆38 · Updated this week
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆214 · Sep 21, 2024 · Updated last year
- Systematic and comprehensive benchmarks for LLM systems. ☆57 · Jan 28, 2026 · Updated 3 months ago
- Prometheus collector and exporter for Slurm cluster metrics. A Slinky project. ☆16 · Nov 7, 2025 · Updated 5 months ago
- ☆18 · Dec 4, 2025 · Updated 5 months ago
- Lithops-based Serverless implementation of the METASPACE spatial metabolomics annotation pipeline ☆12 · Jul 6, 2023 · Updated 2 years ago
- ☆25 · Aug 7, 2017 · Updated 8 years ago
- Dynamic configuration management for Kubernetes ☆26 · Updated this week
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆134 · Feb 22, 2024 · Updated 2 years ago
- ☆12 · Apr 4, 2022 · Updated 4 years ago
- An open source benchmarking framework for IT automation ☆312 · Updated this week
- OpenAPI Golang client library for Slurm REST API. A Slinky project. ☆28 · Updated this week
- CAShift: Benchmarking Log-Based Cloud Attack Detection under Normality Shift (FSE 2025) ☆13 · May 19, 2025 · Updated 11 months ago
- The living Trust and Safety User Guide for the AI Alliance (https://thealliance.ai) ☆15 · Apr 6, 2026 · Updated 3 weeks ago
- Helm charts for llm-d ☆52 · Jul 22, 2025 · Updated 9 months ago
- Collect information about 2018 CS courses in CSE of SYSU. ☆11 · Jun 29, 2022 · Updated 3 years ago
- From Task-based to Instruction-based Automated Log Analysis ☆23 · Jan 7, 2025 · Updated last year
- Region-level profiling for CUDA kernels with trace, NVBit, CUPTI, NSys, and an interactive Explorer. ☆113 · Apr 17, 2026 · Updated 2 weeks ago
- Gateway API Inference Extension ☆660 · Updated this week
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- Official Tensorflow implementation for "Improving the Transferability of Adversarial Samples by Path-Augmented Method" (CVPR 2023). ☆12 · Jun 16, 2023 · Updated 2 years ago
- This repository manifests set which is made to build a prototype system of TraceZip, made by 4 pieces. ☆14 · Jul 17, 2025 · Updated 9 months ago
- GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM ☆183 · Jul 12, 2024 · Updated last year
- Simulator for the datacenter, including power, cooling, server and other components ☆17 · Feb 12, 2025 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆27 · Updated this week
- ☆17 · May 29, 2025 · Updated 11 months ago
- ☆10 · Jun 4, 2024 · Updated last year
- Health checks for Azure N- and H-series VMs. ☆57 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆87 · Apr 28, 2026 · Updated last week
- ☆18 · Apr 22, 2026 · Updated last week
- MPI Benchmark on AWS HPC cluster ☆20 · Jan 31, 2020 · Updated 6 years ago