A command-line interface tool for serving LLMs using vLLM.
☆480Jan 25, 2026Updated last month
Alternatives and similar repositories for vllm-cli
Users who are interested in vllm-cli are comparing it to the libraries listed below.
- A CLI tool for managing your locally downloaded Huggingface models and datasets☆34Aug 19, 2025Updated 6 months ago
- ☆37Aug 4, 2025Updated 7 months ago
- ☆17Aug 5, 2025Updated 7 months ago
- Scripts for training Qwen 2.5 VL with ms-swift and GRPO☆12Feb 27, 2025Updated last year
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs☆900Updated this week
- ☆19Mar 16, 2025Updated 11 months ago
- Developer toolkit to migrate applications from the legacy OpenAI Completions/Chat Completions APIs to the unified Responses API, guided b…☆100Sep 8, 2025Updated 6 months ago
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM☆2,817Updated this week
- Supercharge Your LLM with the Fastest KV Cache Layer☆7,272Mar 3, 2026Updated last week
- Dify knowledge base retrieval tool☆13Apr 3, 2025Updated 11 months ago
- Community maintained hardware plugin for vLLM on AWS Neuron☆24Feb 26, 2026Updated last week
- GA Grid (Beta) is a distributed in-memory Genetic Algorithm (GA) component for Apache Ignite. A GA is a method of solving complex optimi…☆11Nov 14, 2017Updated 8 years ago
- Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed.☆745Jun 6, 2025Updated 9 months ago
- The official repo for “Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem” [EMNLP25]☆34Sep 1, 2025Updated 6 months ago
- Code for paper "Analog Foundation Models"☆31Sep 18, 2025Updated 5 months ago
- 100% private AI transcription with an intuitive template system for maximum flexibility☆72Jul 27, 2025Updated 7 months ago
- InferX: Inference as a Service Platform☆172Updated this week
- zero cost Apply/Applicative syntax☆13Mar 2, 2026Updated last week
- Using deep research workflow to generate datasets for finetuning LLMs.☆39Oct 9, 2025Updated 5 months ago
- ☆11May 20, 2022Updated 3 years ago
- ☆19Aug 23, 2025Updated 6 months ago
- Run GEPA on your favorite non-python libraries.☆33Jan 22, 2026Updated last month
- ☆15Apr 26, 2025Updated 10 months ago
- Recursive Self-Aggregation evals on ARC-AGI☆28Jan 26, 2026Updated last month
- Tiny evaluation of leading LLMs on competitive programming problems☆14Nov 28, 2024Updated last year
- Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond☆797Updated this week
- Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue (ACL 2024)☆25Oct 18, 2025Updated 4 months ago
- An oh-my-zsh plugin that lets you run Claude Code through Kimi K2 (Moonshot)☆16Jul 15, 2025Updated 7 months ago
- Scala implementations of standard algorithms for the Multi-Armed Bandit problem.☆12May 7, 2016Updated 9 years ago
- Short Fourier Transforms for Fresnel-weighted Template summation☆15May 29, 2025Updated 9 months ago
- An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)☆1,866Aug 25, 2025Updated 6 months ago
- [ICLR2026] Test-Time Scaling with Reflective Generative Model☆301Jan 28, 2026Updated last month
- The theory of mind module for the SWE agent☆82Jan 13, 2026Updated last month
- dspy-cli is a tool for creating, developing, testing, and deploying DSPy programs as HTTP APIs.☆122Updated this week
- Coding super-intelligence to find the most optimized Python code. Use it to optimize existing codebases or new Pull requests as a GitHub …☆226Updated this week
- ☆37May 5, 2025Updated 10 months ago
- Try out HallOumi, a state-of-the-art claim verification model in a simple UI!☆42Apr 2, 2025Updated 11 months ago
- ☆30Oct 4, 2025Updated 5 months ago
- A simple wrapper to bring Auggie into your development lifecycle.☆33Dec 8, 2025Updated 3 months ago