A high-throughput and memory-efficient inference and serving engine for LLMs
☆39Aug 30, 2025Updated 9 months ago
Alternatives and similar repositories for vllm
Users that are interested in vllm are comparing it to the libraries listed below.
- ☆11Jun 7, 2025Updated 11 months ago
- ☆72May 12, 2026Updated 3 weeks ago
- ☆18Nov 3, 2025Updated 7 months ago
- A suite of multimodal language models that are powerful and efficient☆19Jan 13, 2025Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆15May 26, 2026Updated last week
- ☆12Mar 21, 2024Updated 2 years ago
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization☆13Mar 20, 2025Updated last year
- ☆13Jul 25, 2024Updated last year
- teler Caddy integrates the powerful security features of teler WAF into the Caddy web server, ensuring your web servers remain secure and…☆17Feb 24, 2025Updated last year
- my dockerfiles☆13May 20, 2026Updated 2 weeks ago
- Community maintained hardware plugin for vLLM on Ascend☆2,158Updated this week
- Band-limited Training and Inference for Convolutional Neural Networks☆20Nov 21, 2022Updated 3 years ago
- React app for inspecting, building and debugging with the Realtime API☆11Nov 5, 2024Updated last year
- deduplication☆15Feb 20, 2023Updated 3 years ago
- ThinK: Thinner Key Cache by Query-Driven Pruning☆30Feb 11, 2025Updated last year
- An interactive companion toy that engages kids with storytelling, singing, and encouragement for physical activities using advanced AI t…☆10Oct 15, 2024Updated last year
- (AAAI24 oral) Implementation of RPPO(Risk-sensitive PPO) and RPBT(Population-based self-play with RPPO)☆12May 22, 2023Updated 3 years ago
- Waf based on caddy2☆20Jul 21, 2022Updated 3 years ago
- ☆17May 5, 2024Updated 2 years ago
- Freeswitch Speech-To-Text module☆17Mar 14, 2026Updated 2 months ago
- rdiv!(::AbstractMatrix, ::UpperTriangular) and ldiv!(::LowerTriangular, ::AbstractMatrix)☆12Nov 18, 2024Updated last year
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆72Jun 1, 2025Updated last year
- ☆75Mar 26, 2025Updated last year
- Socat one-click installation script; forwards TCP and UDP traffic and supports IPv4 and IPv6☆14Jul 25, 2025Updated 10 months ago
- vLLM for embedding tasks using Original LLMs (Qwen2, LLaMA)☆29Sep 9, 2024Updated last year
- Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"☆13Nov 3, 2022Updated 3 years ago
- Julia implementation of flash-attention operation for neural networks.☆11May 31, 2023Updated 3 years ago
- ☆15Mar 11, 2025Updated last year
- Code corresponding to the paper: "On the Robustness of Vision Transformers": https://arxiv.org/abs/2104.02610☆25Dec 16, 2025Updated 5 months ago
- An LLM inference engine, written in C++☆19Mar 30, 2026Updated 2 months ago
- ☆79Dec 15, 2023Updated 2 years ago
- ☆15Jul 11, 2023Updated 2 years ago
- A dynamic GPU memory allocator, suitable for warp synchronized scenarios.☆11Aug 20, 2019Updated 6 years ago
- NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks☆20May 10, 2022Updated 4 years ago
- A ROS node that uses face_net to do face_recognition☆12Jul 27, 2016Updated 9 years ago
- Automatic differentiation of FEniCS and Firedrake models in Julia☆14Mar 21, 2021Updated 5 years ago
- [EMNLP 2025] CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward☆69Aug 10, 2025Updated 9 months ago
- Sparse symmetric indefinite solver implemented with a runtime system☆13May 11, 2020Updated 6 years ago
- A distributed machine learning algorithm for new word discovery.☆15Jul 21, 2014Updated 11 years ago