A high-throughput and memory-efficient inference and serving engine for LLMs
☆55 · Dec 11, 2023 · Updated 2 years ago
Alternatives and similar repositories for vllm-release
Users that are interested in vllm-release are comparing it to the libraries listed below.
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain… ☆18 · Feb 12, 2024 · Updated 2 years ago
- ☆24 · Jul 24, 2023 · Updated 2 years ago
- Detecting Drift in a Diabetes Dataset using Taipy ☆12 · May 19, 2025 · Updated last year
- ☆20 · May 29, 2026 · Updated last month
- PyTorch implementation of a self-attentive speaker embedding ☆17 · Sep 24, 2019 · Updated 6 years ago
- ☆13 · Aug 7, 2021 · Updated 4 years ago
- BH hackathon ☆14 · Apr 4, 2024 · Updated 2 years ago
- meta_llama_2finetuned_text_generation_summarization ☆21 · Jul 21, 2023 · Updated 2 years ago
- ☆19 · Aug 23, 2025 · Updated 10 months ago
- torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for Diffusion Models. ☆25 · Mar 29, 2024 · Updated 2 years ago
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Apr 10, 2026 · Updated 2 months ago
- OpenSource deployment made easy ☆10 · Jun 13, 2015 · Updated 11 years ago
- ☆16 · Mar 12, 2026 · Updated 3 months ago
- AI Powered Dockerfile Generator Using Llama3.1 with GROQ ☆11 · Oct 24, 2024 · Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆20 · Oct 23, 2023 · Updated 2 years ago
- Sentence Embedding as a Service ☆15 · Jun 30, 2025 · Updated last year
- A common protocol for AI agent tools ☆10 · Oct 21, 2024 · Updated last year
- HTML/XML aware reverse proxy ☆17 · Feb 16, 2026 · Updated 4 months ago
- ☆866 · Dec 8, 2023 · Updated 2 years ago
- ☆11 · Apr 3, 2023 · Updated 3 years ago
- Official inference library for Mistral models ☆10,823 · Jun 16, 2026 · Updated 2 weeks ago
- Repo for the testing-genai workshop ☆14 · May 8, 2025 · Updated last year
- Python 3 compatible softphone with support for audio streaming. ☆14 · Apr 18, 2024 · Updated 2 years ago
- A minimal yet unstoppable blueprint for multi-agent AI—anchored by the rare, far-reaching “Multi-Agent AI DAO” (2017 Prior Art)—empowerin… ☆36 · Jan 11, 2025 · Updated last year
- LLM model runway server ☆13 · Sep 13, 2023 · Updated 2 years ago
- Convert source code to LLM ready knowledge base ☆34 · Dec 30, 2025 · Updated 6 months ago
- Research Software Design by Example ☆13 · Sep 7, 2025 · Updated 9 months ago
- Databutton MCP Server ☆27 · Apr 7, 2025 · Updated last year
- ☆164 · Mar 5, 2021 · Updated 5 years ago
- Source and documentation for development of autopilot for a surface vessel ☆15 · Jun 3, 2015 · Updated 11 years ago
- Code-Langchain ☆44 · Feb 20, 2024 · Updated 2 years ago
- SmartThings Lutron Integration