Chen-zexi / vllm-cli
A command-line interface tool for serving LLMs using vLLM.
☆471Updated 2 weeks ago
Alternatives and similar repositories for vllm-cli
Users who are interested in vllm-cli are comparing it to the libraries listed below.
- A modern web interface for managing and interacting with vLLM servers (www.github.com/vllm-project/vllm). Supports both GPU and CPU modes…☆366Updated this week
- ☆442Updated 2 months ago
- ☆265Updated 3 months ago
- Verify the precision of all Kimi K2 API vendors☆513Updated 2 weeks ago
- The LLM abstraction layer for modern AI agent applications.☆507Updated this week
- [ICLR2026] Test-Time Scaling with Reflective Generative Model☆302Updated 2 weeks ago
- Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere☆1,118Updated this week
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon.☆228Updated 3 months ago
- A list of AI memory projects☆632Updated last year
- Community maintained hardware plugin for vLLM on Apple Silicon☆400Updated last week
- Train embedding and reranker models for retrieval tasks on Apple Silicon with MLX☆173Updated 4 months ago
- Bringing the Unsloth experience to Mac users via Apple's MLX framework☆439Updated last week
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B☆573Updated 2 months ago
- Completed research on semantic retrieval augmented generation through novel semantic similarity graph traversal algorithms.☆267Updated 3 months ago
- Evolve your language agent with Agentic Context Engineering (ACE)☆596Updated 3 weeks ago
- Python Implementation of MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings)☆396Updated 2 months ago
- An Automatic Prompt Optimization Framework for Large Language Models☆904Updated 6 months ago
- Open-source framework for holistic, structured repository-level documentation across multilingual codebases☆689Updated last month
- Library for model distillation☆161Updated 5 months ago
- ☆1,205Updated last week
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244)☆460Updated 5 months ago
- ☆197Updated 6 months ago
- A CLI to estimate inference memory requirements for Hugging Face models, written in Python.☆683Updated last week
- REFRAG-style RAG (compress → sense/select → expand) — Single-file reference implementation☆208Updated last month
- AI Agent that researches the lives of historical figures and extracts events into structured JSON timelines using LangGraph multi-agent o…☆227Updated 3 months ago
- "AnyTool: Universal Tool-Use Layer for AI Agents"☆499Updated this week
- An open-source application for building, observing, and collaborating with teams of AI agents.☆420Updated 6 months ago
- ☆82Updated 5 months ago
- Train Large Language Models on MLX.☆258Updated this week
- The Open Deep Research app – generate reports with OSS LLMs☆316Updated 2 weeks ago