A high-throughput and memory-efficient inference and serving engine for LLMs
☆31 · May 12, 2025 · Updated last year
Alternatives and similar repositories for vllm
Users who are interested in vllm are comparing it to the libraries listed below.
- ☆17 · May 19, 2022 · Updated 4 years ago
- ☆21 · Apr 13, 2022 · Updated 4 years ago
- A small library that wraps Keras models to pickle them. ☆14 · Jul 17, 2018 · Updated 7 years ago
- A technical framework for detecting memory leaks ☆13 · Jul 13, 2018 · Updated 7 years ago
- On the effectiveness of adversarial training against common corruptions [UAI 2022] ☆30 · May 16, 2022 · Updated 4 years ago
- Handy tools & graphics API abstraction for blazing fast prototyping ☆10 · Jan 17, 2024 · Updated 2 years ago
- spark-simrank scala ☆17 · Jan 19, 2017 · Updated 9 years ago
- a simple asr system ☆14 · Mar 8, 2018 · Updated 8 years ago
- ncnn export & infer mobileclip ☆22 · Aug 18, 2025 · Updated 9 months ago
- Fully open reproduction of DeepSeek-R1 ☆11 · Mar 24, 2025 · Updated last year
- ☆14 · Mar 30, 2017 · Updated 9 years ago
- Unofficial docker wrapper for Qualcomm SNPE (Snapdragon Neural Processing Engine) SDK ☆11 · Mar 3, 2022 · Updated 4 years ago
- Implementation of HAN for Sentiment Classification task from paper "Hierarchical Attention Networks for Document Classification" ☆13 · Aug 5, 2019 · Updated 6 years ago
- Tengine 管子 ("Pipe") is a helper tool for quickly building demos ☆12 · Jul 15, 2021 · Updated 4 years ago
- A Benchmark for Failure Detection under Distribution Shifts in Image Classification ☆35 · Oct 19, 2024 · Updated last year
- ☆27 · Jan 17, 2026 · Updated 4 months ago
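- A Gomoku (five-in-a-row) AI project implemented in PyTorch ☆40 · Mar 16, 2026 · Updated 2 months ago
- Deep Learning 500 Questions: explains frequently asked topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas in question-and-answer form, to help the author and any readers who need it. The book has 18 chapters and nearly 300,000 characters. Given the author's limited expertise, readers are kindly invited to point out any shortcomings. To be continued... For collaboration, contact sc… ☆18 · Aug 24, 2019 · Updated 6 years ago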
- ncnn is a high-performance neural network inference framework optimized for the mobile platform ☆14 · May 20, 2022 · Updated 4 years ago
- ☆43 · Jan 25, 2024 · Updated 2 years ago
- ☆10 · Dec 12, 2020 · Updated 5 years ago
- The predecessor of CiteLab. ☆18 · Feb 3, 2026 · Updated 3 months ago
- ☆20 · Sep 28, 2024 · Updated last year
- Using ncnn to test the reasoning performance of neural network ☆38 · Jan 18, 2026 · Updated 4 months ago
- ☆14 · Apr 16, 2019 · Updated 7 years ago
- Code and models for the paper Shape-Texture Debiased Neural Network Training (ICLR 2021) ☆111 · Aug 4, 2023 · Updated 2 years ago
- A Modern Configuration/Registry System designed for deeplearning, with some utils. ☆18 · Apr 27, 2026 · Updated 3 weeks ago
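- HayLM is a large model trained specifically for children. By training and fine-tuning InternLM on data covering child psychology, pedagogy, and conversational style, it interacts intelligently with children, keeps learning and adapting to the user's characteristics during conversation, and becomes a virtual friend that accompanies a child as they grow up. ☆16 · Feb 5, 2025 · Updated last year
- 🐱 ncnn int8 model quantization evaluation ☆14 · Oct 10, 2022 · Updated 3 years ago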
- A repository of Python & PyTorch scripts which (currently) converts .safetensors models into scaled FP8 variants, utilizing gradient desc… ☆27 · Aug 8, 2025 · Updated 9 months ago
- Megvii Electric Moped Detector (ONNX based inference) ☆13 · Jul 4, 2021 · Updated 4 years ago
- 📍 Official repository of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS 2023) ☆55 · Nov 8, 2023 · Updated 2 years ago
- A handy local disk based cache for hot content from remote storage. ☆15 · Aug 4, 2023 · Updated 2 years ago
- A basic numpy like library for micropython ☆18 · Feb 11, 2020 · Updated 6 years ago
- Patch Craft: Video Denoising by Deep Modeling and Patch Matching (ICCV 2021) ☆51 · Nov 2, 2021 · Updated 4 years ago
- ☆28 · Jun 30, 2025 · Updated 10 months ago
- Project report from the speech emotion recognition camp of the two-week 好未来 (TAL) AI training camp held July 30 to August 13, 2018 ☆33 · Dec 28, 2018 · Updated 7 years ago
- An object detection codebase based on MegEngine. ☆28 · Dec 14, 2022 · Updated 3 years ago
- GPU methods for alpha matting, including cutting edge research algorithms by Philip G. Lee. ☆12 · Jan 8, 2014 · Updated 12 years ago