A high-throughput and memory-efficient inference and serving engine for LLMs
☆30 May 12, 2025 Updated 10 months ago
Alternatives and similar repositories for vllm
Users that are interested in vllm are comparing it to the libraries listed below.
- ☆16 May 19, 2022 Updated 3 years ago
- ☆21 Apr 13, 2022 Updated 3 years ago
- ☆13 Jun 10, 2022 Updated 3 years ago
- ☆13 May 9, 2023 Updated 2 years ago
- PyTorch implementation of our CVPR2023 paper "OpenMix: Exploring Out-of-Distribution samples for Misclassification Detection" ☆27 Oct 16, 2023 Updated 2 years ago
- On the effectiveness of adversarial training against common corruptions [UAI 2022] ☆30 May 16, 2022 Updated 3 years ago
- ncnn export & infer mobileclip ☆21 Aug 18, 2025 Updated 7 months ago
- Awesome Resources about MegEngine ☆16 Mar 2, 2023 Updated 3 years ago
- Fully open reproduction of DeepSeek-R1 ☆11 Mar 24, 2025 Updated last year
- ☆14 Mar 30, 2017 Updated 8 years ago
- Unofficial docker wrapper for Qualcomm SNPE (Snapdragon Neural Processing Engine) SDK ☆10 Mar 3, 2022 Updated 4 years ago
- ☆20 Jan 17, 2026 Updated 2 months ago
- The Bytepiece Tokenizer Implemented in Rust. ☆14 Nov 28, 2023 Updated 2 years ago
- A multithreaded, high-concurrency server based on the select model, which also implements a memory pool and an object pool ☆10 Nov 4, 2019 Updated 6 years ago
- DLL for retrieving WeChat information (via reverse engineering) ☆13 Sep 17, 2019 Updated 6 years ago
- ☆12 Apr 19, 2022 Updated 3 years ago
- A Gomoku (five-in-a-row) AI project implemented in PyTorch ☆38 Mar 16, 2026 Updated last week
- ☆15 Apr 15, 2022 Updated 3 years ago
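- ENet segmentation on Android, based on ncnn ☆17 Mar 29, 2020 Updated 5 years ago
- 《万界道友》 is an open-source game built around AIGC-driven, highly free-form text gameplay and a xianxia (cultivation) world. You start as an ordinary cultivator and, through cultivation techniques, spiritual roots, supernatural abilities, magical treasures, and chance encounters, work out your own path of cultivation step by step. ☆42 Updated this week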
- An Android Application for GLCC ☆11 Sep 30, 2022 Updated 3 years ago
- ncnn is a high-performance neural network inference framework optimized for the mobile platform ☆14 May 20, 2022 Updated 3 years ago
- The predecessor of CiteLab. ☆18 Feb 3, 2026 Updated last month
- 🐱 ncnn int8 model quantization evaluation ☆14 Oct 10, 2022 Updated 3 years ago
- Winning solution for ESRI Data Science Challenge 2019 - Hacker Earth ☆27 Oct 16, 2019 Updated 6 years ago
- ☆14 Apr 16, 2019 Updated 6 years ago
- Code and models for the paper Shape-Texture Debiased Neural Network Training (ICLR 2021) ☆111 Aug 4, 2023 Updated 2 years ago
- ☆18 Nov 30, 2022 Updated 3 years ago
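- HayLM is a large language model trained specifically for children. Through training and fine-tuning of InternLM on data covering child psychology, pedagogy, and conversational style, it interacts intelligently with children, continually learns and adapts to each user's traits during conversation, and becomes a virtual friend that accompanies children as they grow. ☆16 Feb 5, 2025 Updated last year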
- A repository of Python & PyTorch scripts which (currently) converts .safetensors models into scaled FP8 variants, utilizing gradient desc… ☆27 Aug 8, 2025 Updated 7 months ago
- Megvii Electric Moped Detector (ONNX based inference) ☆13 Jul 4, 2021 Updated 4 years ago
- 📍 Official repository of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS 2023) ☆56 Nov 8, 2023 Updated 2 years ago
- useful dotfiles included vim, zsh, tmux and vscode ☆19 Feb 13, 2026 Updated last month
- Jax implementation of VIT-VQGAN ☆10 Jan 25, 2024 Updated 2 years ago
- A handy local disk based cache for hot content from remote storage. ☆15 Aug 4, 2023 Updated 2 years ago
- A basic numpy like library for micropython ☆18 Feb 11, 2020 Updated 6 years ago
- Code for our paper "Informative Dropout for Robust Representation Learning: A Shape-bias Perspective" (ICML 2020) ☆125 Dec 8, 2022 Updated 3 years ago
- An object detection codebase based on MegEngine. ☆28 Dec 14, 2022 Updated 3 years ago
- MegEngine build with cu11x ☆17 Mar 13, 2023 Updated 3 years ago