micytao / vllm-playground
☆120 · Updated last week
Alternatives and similar repositories for vllm-playground
Users who are interested in vllm-playground are comparing it to the repositories listed below.
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆327 · Updated this week
- A command-line interface tool for serving LLM using vLLM. ☆454 · Updated this week
- A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM ☆140 · Updated this week
- Common recipes to run vLLM ☆256 · Updated this week
- Benchmark and optimize LLM inference across frameworks with ease ☆141 · Updated 2 months ago
- Route LLM requests to the best model for the task at hand. ☆137 · Updated 3 weeks ago
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆730 · Updated this week
- Self-host LLMs with vLLM and BentoML ☆161 · Updated last week
- ☆268 · Updated last week
- ToolOrchestra is an end-to-end RL training framework for orchestrating tools and agentic workflows. ☆289 · Updated this week
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆278 · Updated last month
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆450 · Updated 3 months ago
- ☆234 · Updated last week
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆479 · Updated 3 months ago
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆274 · Updated 4 months ago
- Inference server benchmarking tool ☆130 · Updated 2 months ago
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ☆494 · Updated last week
- ☆113 · Updated 3 months ago
- Checkpoint-engine is a simple middleware to update model weights in LLM inference engines ☆851 · Updated 2 weeks ago
- Efficient LLM Inference over Long Sequences ☆392 · Updated 5 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆257 · Updated this week
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆167 · Updated 3 months ago
- Real-Time Detection of Hallucinated Entities in Long-Form Generation ☆269 · Updated 3 weeks ago
- ☆79 · Updated 2 months ago
- An early research stage expert-parallel load balancer for MoE models based on linear programming. ☆433 · Updated 2 weeks ago
- A collection of all available inference solutions for the LLMs ☆93 · Updated 9 months ago
- Utils for Unsloth https://github.com/unslothai/unsloth ☆177 · Updated last week
- Verifiers for LLM Reinforcement Learning ☆79 · Updated 2 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆304 · Updated last month
- An interface library for RL post training with environments. ☆789 · Updated this week