neuralmagic / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆13Updated this week
Alternatives and similar repositories for vllm
Users that are interested in vllm are comparing it to the libraries listed below
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆73Updated 2 weeks ago
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆36Updated last year
- Example implementation of Iteration of Thought - Give a star if you like the project☆42Updated 6 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…☆73Updated 8 months ago
- A collection of all available inference solutions for the LLMs☆91Updated 4 months ago
- Self-host LLMs with LMDeploy and BentoML☆20Updated last week
- A collection of example AI programs built using DSPy and maintained by the Langtrace AI team.☆33Updated 7 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆32Updated 2 months ago
- ☆40Updated 2 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆71Updated 4 months ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT)☆115Updated 5 months ago
- ☆52Updated 8 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆55Updated 5 months ago
- Cray-LM unified training and inference stack.☆22Updated 5 months ago
- ☆37Updated 9 months ago
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM☆59Updated last year
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale …☆19Updated 2 weeks ago
- ☆34Updated 4 months ago
- Verbosity control for AI agents☆64Updated last year
- ☆45Updated last year
- ☆19Updated 4 months ago
- ☆62Updated 3 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours☆63Updated 10 months ago
- Train, tune, and infer Bamba model☆130Updated last month
- ☆19Updated 11 months ago
- A locally trained model of Stoney Nakoda has been developed and released. You can access the working model here or train your own instanc…☆10Updated 3 months ago
- Machine Learning Serving focused on GenAI with simplicity as the top priority.☆59Updated last week
- Lego for GRPO☆28Updated last month
- ☆41Updated 3 weeks ago
- Develop, evaluate and monitor LLM applications at scale☆100Updated 7 months ago