A high-throughput and memory-efficient inference and serving engine for LLMs
☆13Oct 10, 2025Updated 5 months ago
Alternatives and similar repositories for vllm
Users that are interested in vllm are comparing it to the libraries listed below.
- SGLang is a fast serving framework for large language models and vision language models.☆20Updated this week
- ☆19Jan 27, 2021Updated 5 years ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence☆10Mar 2, 2025Updated last year
- Expert Specialization MoE Solution based on CUTLASS☆27Jan 19, 2026Updated 2 months ago
- This repository contains the results and code for the MLPerf™ Training v3.0 benchmark.☆12Aug 10, 2023Updated 2 years ago
- This is a repo with a Triton Server deployment template☆24Aug 4, 2024Updated last year
- BERT NER in PyTorch, including an ERNIE implementation.☆11Aug 28, 2019Updated 6 years ago
- DLBlas: clean and efficient kernels☆35Mar 16, 2026Updated 2 weeks ago
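- C++ documentation from the Wangdao (王道) training course, containing every assignment submission project from the course; each project includes a .md file with detailed notes.☆13Oct 9, 2019Updated 6 years ago
- A lightweight front-end library for mobile devices☆17May 24, 2013Updated 12 years ago
- A QQ group chat bot based on the official API☆22Nov 27, 2025Updated 4 months ago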
- ☆32Apr 19, 2025Updated 11 months ago
- Code for the paper: Dense Reward for Free in Reinforcement Learning from Human Feedback (ICML 2024) by Alex J. Chan, Hao Sun, Samuel Holt…☆38Aug 11, 2024Updated last year
- ☆35Jul 19, 2021Updated 4 years ago
- Generate PHPUnit tests from annotations, which you can write in your methods documentation☆11Aug 4, 2021Updated 4 years ago
- Central Point on how to install macOS on Lenovo Ideapad Flex 5 14IIL05 and maybe similar models.☆10May 18, 2023Updated 2 years ago
- Dockerfile to generate Intellij Idea project shared index☆10Jun 2, 2022Updated 3 years ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044☆35Oct 3, 2024Updated last year
- Integration and automation of NS-3 network simulator and Linux Containers☆12Nov 12, 2019Updated 6 years ago
- Cerebro plugin to record and fetch list of items on clipboard☆13Aug 8, 2017Updated 8 years ago
- ☆13Apr 28, 2017Updated 8 years ago
- Demo of integration test coverage collection for f8app☆10Jun 5, 2017Updated 8 years ago
- A minimal toolkit for Context Engineering — Select, Compress, and Persist context with pure functions.☆38Jan 20, 2026Updated 2 months ago
- A fuzzer for the CAN bus☆18Mar 1, 2025Updated last year
- Rust framework for building high-performance network services☆64Aug 29, 2025Updated 7 months ago
- Provides a lot of basic functionality for our algorithms for the Virtual Network Embedding Problem (VNEP).☆12May 3, 2020Updated 5 years ago
- ☆51Mar 9, 2026Updated 2 weeks ago
- Reproducing Quantization paper PACT☆66Jul 8, 2022Updated 3 years ago
- Python implementation of Discovery V5 Protocol☆19Mar 1, 2022Updated 4 years ago
- Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024…☆42Nov 20, 2024Updated last year
- Multi-cluster monitoring and alerting with thanos sidecar + MinIO☆15Feb 20, 2023Updated 3 years ago
- Sprite tool that outputs sprite images and stylesheet files based on configured images; supports rem☆16Sep 11, 2016Updated 9 years ago
- The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation"☆22Apr 22, 2025Updated 11 months ago
- A curated list of amazingly awesome JPHP libraries, resources and shiny things.☆18Feb 27, 2020Updated 6 years ago
- sql☆44Feb 4, 2016Updated 10 years ago
- vLLM Router☆55Mar 11, 2024Updated 2 years ago
- [POC] Rootless Containers without `/etc/subuid` and `/etc/subgid`☆19Oct 16, 2020Updated 5 years ago
- An extension package for the hot-deployment plugin HotSecondsServer that implements hot updates for various third-party frameworks; continuously updated.☆17Jul 10, 2024Updated last year
- Transformer related optimization, including BERT, GPT☆17Jul 29, 2023Updated 2 years ago