A high-throughput and memory-efficient inference and serving engine for LLMs
☆31May 12, 2025Updated last year
Alternatives and similar repositories for vllm
Users who are interested in vllm are comparing it to the libraries listed below.
- ☆21Apr 13, 2022Updated 4 years ago
- ☆13Jun 10, 2022Updated 4 years ago
- ☆16Jun 4, 2025Updated last year
- PyTorch implementation of our CVPR2023 paper "OpenMix: Exploring Out-of-Distribution samples for Misclassification Detection"☆27Oct 16, 2023Updated 2 years ago
- On the effectiveness of adversarial training against common corruptions [UAI 2022]☆30May 16, 2022Updated 4 years ago
- Handy tools & graphics API abstraction for blazing fast prototyping☆10Jan 17, 2024Updated 2 years ago
- Fully open reproduction of DeepSeek-R1☆11Mar 24, 2025Updated last year
- Real-time AI video segmentation of USB camera and streaming over HTTP☆13Apr 23, 2025Updated last year
- ☆14Mar 30, 2017Updated 9 years ago
- Unofficial docker wrapper for Qualcomm SNPE(Snapdragon Neural Processing Engine) SDK☆11Mar 3, 2022Updated 4 years ago
- ☆10Mar 24, 2024Updated 2 years ago
- Tengine Pipe (管子) is a helper tool for quickly producing demos☆11Jul 15, 2021Updated 4 years ago
- A Benchmark for Failure Detection under Distribution Shifts in Image Classification☆36Oct 19, 2024Updated last year
- Easy to download and parse version of the Smartdoc 2015 - Challenge 1 dataset.☆18Mar 5, 2018Updated 8 years ago
- A Gomoku (five-in-a-row) AI project implemented in PyTorch☆42Mar 16, 2026Updated 2 months ago
- ☆15Apr 15, 2022Updated 4 years ago
- ENet segmentation on Android, based on ncnn☆17Mar 29, 2020Updated 6 years ago
- ncnn is a high-performance neural network inference framework optimized for the mobile platform☆14May 20, 2022Updated 4 years ago
- ☆43Jan 25, 2024Updated 2 years ago
- ☆20Sep 28, 2024Updated last year
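- 《万界道友》 is an open-source game built around AIGC-driven content, a highly free-form text experience, and a xianxia (immortal cultivation) setting. You start as an ordinary cultivator and, with the help of cultivation techniques, spiritual roots, divine abilities, magical treasures, and chance encounters, work out your own path of cultivation step by step.☆64Updated this week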
- Using ncnn to test the inference performance of neural networks☆38Jan 18, 2026Updated 4 months ago
- A Rust implementation of ncnn, a lightweight neural network inference framework; this repository separates out the static library to make it suitable for cross-platform compilation☆21Sep 12, 2024Updated last year
- A repository to store my CUDA code, including some commonly used kernels.☆12Sep 19, 2021Updated 4 years ago
- A Modern Configuration/Registry System designed for deep learning, with some utils.☆18Apr 27, 2026Updated last month
- ☆18Nov 30, 2022Updated 3 years ago
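- HayLM is a large model trained specifically for children. Built by training and fine-tuning InternLM on data drawing on child psychology, pedagogy, and conversational style, it interacts intelligently with children, continually learns and adapts to each user's characteristics during conversation, and aims to be a virtual friend that accompanies children as they grow up.☆16Feb 5, 2025Updated last year
- 🐱 ncnn int8 model quantization evaluation☆14Oct 10, 2022Updated 3 years ago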
- A repository of Python & PyTorch scripts which (currently) converts .safetensors models into scaled FP8 variants, utilizing gradient desc…☆26Aug 8, 2025Updated 10 months ago
- Call ncnn from Fortran☆18Dec 18, 2022Updated 3 years ago
- Megvii Electric Moped Detector (ONNX based inference)☆13Jul 4, 2021Updated 4 years ago
- 📍 Official repository of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS 2023)☆56Nov 8, 2023Updated 2 years ago
- ☆54Sep 11, 2021Updated 4 years ago
- A handy local disk based cache for hot content from remote storage.☆15Aug 4, 2023Updated 2 years ago
- A list of papers that study out-of-distribution (OOD) detection and misclassification detection (MisD)☆53Oct 6, 2023Updated 2 years ago
- An object detection codebase based on MegEngine.☆28Dec 14, 2022Updated 3 years ago
- MegEngine built with cu11x☆17Mar 13, 2023Updated 3 years ago
- TVMScript kernel for deformable attention☆25Dec 15, 2021Updated 4 years ago
- Tiling the plane with irregular quadrilaterals - demo & algorithm explanation☆24Aug 16, 2021Updated 4 years ago