A high-throughput and memory-efficient inference and serving engine for LLMs
☆13Feb 11, 2026Updated last month
Alternatives and similar repositories for vllm
Users that are interested in vllm are comparing it to the libraries listed below.
- The driver for LMCache core to run in vLLM☆63Feb 4, 2025Updated last year
- Elaina is a wavefront implementation of walk on stars. (Code for SIGGRAPH 2025 paper "Guiding-Based Importance Sampling for Walk on Stars…☆28Oct 7, 2025Updated 5 months ago
- Java 8 Streams C++ port☆15May 9, 2022Updated 3 years ago
- This is my master's thesis, which evaluates 6D pose estimation deep learning methods for use in an AR use case. It includes 2 new proxies …☆17Feb 7, 2020Updated 6 years ago
- Horizontal Fusion☆24Jan 7, 2022Updated 4 years ago
- Huawei collective communication performance tests☆15May 27, 2024Updated last year
- PyTorch: training an EfficientNet model with pseudo-labels☆11Dec 28, 2019Updated 6 years ago
- ☆11Sep 7, 2024Updated last year
- Metis: Understanding and Enhancing Regular Expressions in Network☆14Aug 19, 2022Updated 3 years ago
- Expert Kit is an efficient foundation of Expert Parallelism (EP) for MoE model inference on heterogeneous hardware☆61Jan 28, 2026Updated 2 months ago
- The audio player for Flutter with a heart of gold☆13May 13, 2023Updated 2 years ago
- Cluster management tools for the Hydro stack☆19Feb 5, 2021Updated 5 years ago
- See vLLM official support: https://github.com/vllm-project/vllm-ascend☆11Feb 5, 2025Updated last year
- Source and solution codes for Professional CUDA C Programming book.☆15Aug 20, 2020Updated 5 years ago
- (Obsoleted) A speech signal processing library designed for CVE of Rocaloid Project.☆15Dec 7, 2013Updated 12 years ago
- Source Code for Partial Interference☆10Dec 17, 2022Updated 3 years ago
- ☆16Feb 18, 2026Updated last month
- A Distributed Analysis and Benchmarking Framework for Apache OpenWhisk Serverless Platform☆12Dec 11, 2018Updated 7 years ago
- The code for paper 'Hierarchical Policy for Non-prehensile Multi-object Rearrangement with Deep Reinforcement Learning and Monte Carlo Tr…☆21Aug 18, 2023Updated 2 years ago
- OmniMCP uses Microsoft OmniParser and Model Context Protocol (MCP) to provide AI models with rich UI context and powerful interaction cap…☆71Apr 8, 2025Updated 11 months ago
- ove2xml is a handy, easy-to-use application specially designed to help you convert documents from the music notation software Overture to MusicX…☆13Oct 12, 2015Updated 10 years ago
- Robotic platform for industrial control systems cybersecurity research. We use the research-grade Youbot as the robotics platform for ou…☆27Aug 6, 2015Updated 10 years ago
- EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers)☆31Mar 20, 2026Updated last week
- Repository for implementation of active learning and semi-supervised learning algorithms and applying them to medical imaging datasets☆16May 17, 2021Updated 4 years ago
- Code for undergraduate thesis "Active Learning for Deep Object Detection".☆14Nov 12, 2023Updated 2 years ago
- Zoom in Lesions for Better Diagnosis: Attention Guided Deformation Network for WCE Image Classification☆13Aug 4, 2020Updated 5 years ago
- Mathematical expression evaluator with just in time code generation.☆12Apr 7, 2013Updated 12 years ago
- A simple LaTeX template for CUHK thesis.☆14Apr 24, 2023Updated 2 years ago
- [Deprecated] Plugin for the QChatGPT project that switches between models of the same type☆21Aug 13, 2024Updated last year
- Distributed SDDMM Kernel☆12Jul 8, 2022Updated 3 years ago
- Crawls phone information from Taobao and JD and cleans the raw data, then uses that data to analyze and compare prices of different phone models.…☆12Apr 15, 2020Updated 5 years ago
- Main repository of the BeFaaS project☆15Jun 29, 2023Updated 2 years ago
- Official implementation of the papers "User-controlled federated matrix factorization for recommender systems" and "FedeRank: User Contro…☆18Jul 28, 2020Updated 5 years ago
- Papers related to the Recommender System from SIGIR 2021 (including the links for Paper PDF, Github Code and Dataset)☆24Jun 9, 2021Updated 4 years ago
- ☆12Mar 31, 2021Updated 4 years ago
- LLM serving cluster simulator☆140Apr 25, 2024Updated last year
- A homepage built with Vue 3☆48Sep 10, 2025Updated 6 months ago
- A reimplemented sync server for Obsidian via reverse engineering.☆21May 11, 2024Updated last year
- ☆12Jun 3, 2019Updated 6 years ago