openshift-psap / auto-tuning-vllm
Auto-tuning for vllm. Getting the best performance out of your LLM deployment (vllm+guidellm+optuna)
☆36 Jan 29, 2026 Updated 2 weeks ago
Alternatives and similar repositories for auto-tuning-vllm
Users that are interested in auto-tuning-vllm are comparing it to the libraries listed below
- Digital SuperTwin: digital twin of supercomputers ☆13 Nov 24, 2024 Updated last year
- Protocol buffers and other common resources. ☆13 Jan 20, 2026 Updated 3 weeks ago
- This project is based on the [LTX-Video](https://github.com/Lightricks/LTX-Video) algorithm of the diffusers and optimized and accelerate… ☆11 Dec 31, 2024 Updated last year
- Ibexa Experience is a modern modular Digital Experience Platform (DXP) designed for customer-centric companies and organizations who want… ☆10 Feb 6, 2026 Updated last week
- Find the idea for your next project/startup posted by people all over the world. Alternatively, post your idea over the platform and allo… ☆11 May 13, 2023 Updated 2 years ago
- A performance testing and analysis automation framework ☆14 Jan 26, 2026 Updated 2 weeks ago
- Astrology app, with birth chart calculation based on your time and place of birth. ☆12 Aug 24, 2021 Updated 4 years ago
- ANDROID APP to AUTO GENERATE SUBTITLE FILE and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any audio/vide… ☆19 May 5, 2024 Updated last year
- Vectorize HTML files and generate embeddings with structural and semantic expression (WIP) ☆11 Feb 16, 2023 Updated 2 years ago
- The repo of the Doc2SoarGraph framework ☆10 Sep 17, 2024 Updated last year
- [ICML 2025] Efficiently Serving Large Multimodal Models Using EPD Disaggregation ☆22 May 29, 2025 Updated 8 months ago
- daily.dev Android application ☆12 Feb 3, 2026 Updated last week
- A Partytown plugin for Fresh ☆12 Oct 10, 2023 Updated 2 years ago
- Model explanation provides the ability to interpret the effect of the predictors on the composition of an individual score. ☆13 Jan 21, 2021 Updated 5 years ago
- ☆13 Jan 7, 2025 Updated last year
- 🚀 LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases. ☆13 Jul 12, 2025 Updated 7 months ago
- Deprecated version of CSK, see new one here: ☆14 Feb 18, 2025 Updated 11 months ago
- A rust wrapper for HIP ☆12 Jun 10, 2025 Updated 8 months ago
- 🚸 Introducing Lifetable, add the missing all-in-one community to the spreadsheet database ecology and so much more. Based on Next.js 14 … ☆12 May 14, 2024 Updated last year
- A statistical framework for graph anomaly detection. ☆17 Sep 23, 2018 Updated 7 years ago
- OpenAI compatible API for open source LLMs ☆16 Oct 30, 2023 Updated 2 years ago
- A JavaScript library for creating and editing videos in the browser. ☆20 Updated this week
- DocGenius AI - Generative AI Chatbot for your Documents ☆14 Aug 14, 2025 Updated 6 months ago
- AutoML 2024: HPOD: Hyperparameter Optimization for Unsupervised Outlier Detection ☆12 Jul 12, 2024 Updated last year
- A collection of development container 'features' for machine learning and data science ☆11 Nov 20, 2025 Updated 2 months ago
- Empowering everyone to create reliable and safe AI coding agents. ☆12 Sep 2, 2024 Updated last year
- A suite of local-first apps, all on one unified account ☆15 Updated this week
- clustering algorithm implementation ☆13 Nov 3, 2025 Updated 3 months ago
- A conversion tool between scala types and protobuf-java types. ☆12 Dec 21, 2021 Updated 4 years ago
- A modern theme for MediaWiki, built on Bootstrap 3 and Skinny. ☆12 Oct 27, 2016 Updated 9 years ago
- An AI agents framework addressing the two core challenges with real world agents - Optimisation and Deployment ☆14 Apr 3, 2024 Updated last year
- An ecosystem of Rust libraries for working with large language models ☆13 Oct 2, 2023 Updated 2 years ago
- (MacOS Support) OpenAI compatible http server for Spark-TTS ☆15 May 1, 2025 Updated 9 months ago
- Fione is Enterprise AI Platform ☆16 Nov 9, 2025 Updated 3 months ago
- Yet another coding assistant powered by LLM. ☆16 Sep 11, 2024 Updated last year
- The CreateJS build tools & process. ☆12 Jan 25, 2019 Updated 7 years ago
- ☆18 Mar 4, 2025 Updated 11 months ago
- JAX bindings for the flash-attention3 kernels ☆20 Jan 2, 2026 Updated last month
- Framework to achieve context distillation in LLMs ☆15 Nov 24, 2023 Updated 2 years ago