A high-throughput and memory-efficient inference and serving engine for LLMs
☆13Apr 13, 2026Updated last week
Alternatives and similar repositories for vllm
Users that are interested in vllm are comparing it to the libraries listed below.
- The driver for LMCache core to run in vLLM☆64Feb 4, 2025Updated last year
- Elaina is a wavefront implementation of walk on stars. (Code for SIGGRAPH 2025 paper "Guiding-Based Importance Sampling for Walk on Stars…☆28Oct 7, 2025Updated 6 months ago
- Java 8 Streams C++ port☆15May 9, 2022Updated 3 years ago
- This is my Master's thesis, which evaluates 6D pose estimation deep learning methods for use in an AR use case. It includes 2 new proxies …☆17Feb 7, 2020Updated 6 years ago
- FlashSampling: Fast and Memory-Efficient Exact Sampling (https://huggingface.co/papers/2603.15854)☆66Apr 9, 2026Updated last week
- Horizontal Fusion☆24Jan 7, 2022Updated 4 years ago
- Huawei collective communication performance test☆16May 27, 2024Updated last year
- PyTorch: training an EfficientNet model with pseudo-labels☆11Dec 28, 2019Updated 6 years ago
- Metis: Understanding and Enhancing Regular Expressions in Network☆14Aug 19, 2022Updated 3 years ago
- ☆12Sep 7, 2024Updated last year
- Expert Kit is an efficient foundation of Expert Parallelism (EP) for MoE model inference on heterogeneous hardware☆63Jan 28, 2026Updated 2 months ago
- Cluster management tools for the Hydro stack☆19Feb 5, 2021Updated 5 years ago
- The audio player for Flutter with a heart of gold☆13May 13, 2023Updated 2 years ago
- See vLLM official support: https://github.com/vllm-project/vllm-ascend☆11Feb 5, 2025Updated last year
- Source and solution codes for Professional CUDA C Programming book.☆15Aug 20, 2020Updated 5 years ago
- (Obsoleted) A speech signal processing library designed for CVE of Rocaloid Project.☆15Dec 7, 2013Updated 12 years ago
- Source Code for Partial Interference☆10Dec 17, 2022Updated 3 years ago
- A Distributed Analysis and Benchmarking Framework for Apache OpenWhisk Serverless Platform☆12Dec 11, 2018Updated 7 years ago
- ☆17Feb 18, 2026Updated 2 months ago
- The code for paper 'Hierarchical Policy for Non-prehensile Multi-object Rearrangement with Deep Reinforcement Learning and Monte Carlo Tr…☆21Aug 18, 2023Updated 2 years ago
- OmniMCP uses Microsoft OmniParser and Model Context Protocol (MCP) to provide AI models with rich UI context and powerful interaction cap…☆71Apr 8, 2025Updated last year
- ove2xml is a handy, easy-to-use application specially designed to help you convert documents from the music notation software Overture to MusicX…☆13Oct 12, 2015Updated 10 years ago
- Robotic platform for industrial control systems cybersecurity research. We use the research-grade Youbot as the robotics platform for ou…☆27Aug 6, 2015Updated 10 years ago
- EleutherAI ML Performance reading group repository (slides, meeting recordings, annotated papers)☆31Mar 20, 2026Updated last month
- Repository for implementation of active learning and semi-supervised learning algorithms and applying them to medical imaging datasets☆16May 17, 2021Updated 4 years ago
- Code for undergraduate thesis "Active Learning for Deep Object Detection".☆14Nov 12, 2023Updated 2 years ago
- Zoom in Lesions for Better Diagnosis: Attention Guided Deformation Network for WCE Image Classification☆13Aug 4, 2020Updated 5 years ago
- Mathematical expression evaluator with just in time code generation.☆12Apr 7, 2013Updated 13 years ago
- A simple LaTeX template for CUHK thesis.☆16Apr 24, 2023Updated 2 years ago
- [Deprecated] Plugin for the QChatGPT project that switches between models of the same type☆21Aug 13, 2024Updated last year
- Distributed SDDMM Kernel☆12Jul 8, 2022Updated 3 years ago
- Crawl phone information from Taobao and JD, clean those raw data. Use those data to analyze and compare prices of different phone models.…☆12Apr 15, 2020Updated 6 years ago
- Main repository of the BeFaaS project☆15Jun 29, 2023Updated 2 years ago
- Official implementation of the papers "User-controlled federated matrix factorization for recommender systems" and "FedeRank: User Contro…☆18Jul 28, 2020Updated 5 years ago
- Papers related to the Recommender System from SIGIR 2021 (including the links for Paper PDF, Github Code and Dataset)☆24Jun 9, 2021Updated 4 years ago
- ☆12Mar 31, 2021Updated 5 years ago
- LLM serving cluster simulator☆147Apr 25, 2024Updated last year
- Homepage based on Vue 3☆47Sep 10, 2025Updated 7 months ago
- ☆13Jun 3, 2019Updated 6 years ago