neuralmagic / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆12Updated this week
Alternatives and similar repositories for vllm:
Users who are interested in vllm are comparing it to the libraries listed below.
- Cray-LM unified training and inference stack.☆22Updated 2 months ago
- Example implementation of Iteration of Thought - Give a star if you like the project☆40Updated 3 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆36Updated 11 months ago
- The purpose of this repo is to implement LLMOps as shared in the DeepLearning.AI course☆16Updated this week
- Python Server for C3 AI app. A project that brings the power of Large Language Models (LLM) and Retrieval-Augmented Generation (RAG) with…☆23Updated last year
- ☆51Updated 5 months ago
- ☆17Updated 2 months ago
- Machine Learning Serving focused on GenAI with simplicity as the top priority.☆58Updated 2 weeks ago
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod☆17Updated 10 months ago
- ☆16Updated 11 months ago
- The original BabyAGI, updated with LiteLLM and no vector database reliance (csv instead)☆21Updated 6 months ago
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?☆21Updated 9 months ago
- Verbosity control for AI agents☆62Updated 11 months ago
- An open-source version of the GitHub Copilot Workspace☆11Updated 10 months ago
- Official repository for Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions☆17Updated this week
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…☆66Updated 5 months ago
- An intelligent code optimization system leveraging AI analysis, automated refactoring, and test generation. Built with DSPy and Gradio, i…☆18Updated 2 months ago
- ☆20Updated last year
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale …☆19Updated last week
- ☆48Updated 5 months ago
- Ultra Fast Multi-Modality Vector Database☆18Updated last year
- Code interpreter support for o1☆32Updated 7 months ago
- A forest of autonomous agents.☆19Updated 2 months ago
- ☆41Updated 11 months ago
- ☆19Updated 8 months ago
- A locally trained model of Stoney Nakoda has been developed and released. You can access the working model here or train your own instanc…☆10Updated 2 weeks ago
- A framework making it effortless to convert any LLM into a reasoning agent like o1 or DeepSeek's r1☆20Updated 2 weeks ago
- ☆11Updated 9 months ago
- Streamlit app for recommending eval functions using prompt diffs☆27Updated last year
- Example code using the DSPy framework.☆18Updated 10 months ago