A high-throughput and memory-efficient inference and serving engine for LLMs
☆39Jan 26, 2025Updated last year
Alternatives and similar repositories for vllm
Users that are interested in vllm are comparing it to the libraries listed below.
- ☆16May 16, 2025Updated 10 months ago
- ☆13Jun 17, 2024Updated last year
- ☆16Mar 22, 2024Updated 2 years ago
- Hypercorn is an ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn.☆14Jan 12, 2026Updated 2 months ago
- ☆14Jul 5, 2024Updated last year
- [EMNLP 2024] "ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models"☆26Jun 24, 2024Updated last year
- ☆45Jun 19, 2025Updated 9 months ago
- A website for accessing many models through an API (deepseek, Qwen, Hunyuan, etc.)☆16Jul 12, 2025Updated 8 months ago
- ☆26Sep 5, 2024Updated last year
- Zeta implementation of a reusable and plug in and play feedforward from the paper "Exponentially Faster Language Modeling"☆16Nov 11, 2024Updated last year
- Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy☆44Feb 8, 2026Updated last month
- my personal mcp server☆13Apr 23, 2025Updated 11 months ago
- ☆37Jan 13, 2026Updated 2 months ago
- A curated list of open-source projects related to MoonshotCoder.☆35May 22, 2024Updated last year
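- Topic: Computational Cognitive Science. This repository originated as the IA003 final BP; it is still in its infancy and its content layout has yet to be finalized. The next major update is expected in three or four years.☆17May 22, 2019Updated 6 years ago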
- ☆17Apr 7, 2025Updated 11 months ago
- Chef cookbooks for managing a Ceph cluster☆12Apr 2, 2023Updated 2 years ago
- Simplest online-softmax notebook for explaining Flash Attention☆16Jan 27, 2026Updated 2 months ago
- Official implementation of "Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought" (NeurIPS 2025)☆38Oct 8, 2025Updated 5 months ago
- ☆39Apr 5, 2024Updated last year
- Third-prize entry in the 2022 Loongson Cup individual competition☆14Oct 11, 2023Updated 2 years ago
- OLMost every training recipe you need to perform data interventions with the OLMo family of models.☆67Mar 19, 2026Updated last week
- my dockerfiles☆13Mar 16, 2026Updated last week
- ☆97Feb 11, 2026Updated last month
- [ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"☆22Feb 16, 2025Updated last year
- Ongoing research project for code&math LLMs☆27Jul 4, 2025Updated 8 months ago
- NSCSCC 2020 - Yet Another MIPS Processor☆14Aug 7, 2021Updated 4 years ago
- java implementation of Bert Tokenizer, support output onnx tensor for onnx model inference☆13Sep 4, 2023Updated 2 years ago
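- NSCSCC "Loongson Cup" 2024 individual competition, LoongArch track, third prize☆14Aug 17, 2024Updated last year
- A simple and efficient AI command-line assistant that supports chat, command generation, and file processing.☆17Sep 16, 2025Updated 6 months ago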
- Code of PyTorch implementation of 'Weakly Supervised Semantic Segmentation via Box-driven Masking and Filling Rate Shifting'☆16Aug 8, 2023Updated 2 years ago
- Implementation of paper 'Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference' [NeurIPS'24…☆26Jun 14, 2024Updated last year
- React app for inspecting, building and debugging with the Realtime API☆11Nov 5, 2024Updated last year
- ☆25Mar 8, 2026Updated 2 weeks ago
- this is a trained yolov8n network that only detects people, at "eye-height", trained in a super basic way on COCO☆13Dec 18, 2023Updated 2 years ago
- A toolkit to assess data privacy in LLMs (under development)☆70Jan 2, 2025Updated last year
- qwen-nsa☆87Oct 14, 2025Updated 5 months ago
- Entry in the 2023 Loongson Cup MIPS track☆14Dec 23, 2023Updated 2 years ago
- The implementation of Text Classification with Negative Supervision (ACL, 2020)☆10Oct 8, 2020Updated 5 years ago