Evaluate how vLLM and SGLang perform when running a small LLM on a mid-range NVIDIA GPU
☆20Apr 2, 2026Updated last week
Alternatives and similar repositories for vllm-sglang-perf
Users that are interested in vllm-sglang-perf are comparing it to the libraries listed below.
- A Python wrapper for the ROUGE summarization evaluation package☆14Aug 9, 2017Updated 8 years ago
- Intel Gaudi's Megatron DeepSpeed Large Language Models for training☆18Dec 19, 2024Updated last year
- The repository for the paper "Predicting in-hospital mortality by combining clinical notes with time-series data"☆12May 23, 2021Updated 4 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆17Apr 3, 2026Updated last week
- ☆12Aug 2, 2024Updated last year
- Recursive Self-Aggregation evals on ARC-AGI☆29Jan 26, 2026Updated 2 months ago
- Local lightning-fast semantic code search built for agents☆41Mar 16, 2026Updated 3 weeks ago
- A branch of marss with DRAMSim hooks☆18Aug 22, 2013Updated 12 years ago
- ☆15Oct 4, 2024Updated last year
- ☆19Aug 23, 2025Updated 7 months ago
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"☆31Mar 26, 2026Updated 2 weeks ago
- Fixes the rotation of the images based on EXIF data☆15Updated this week
- Implementation for Decision-focused Summarization (EMNLP2021)☆12Mar 14, 2022Updated 4 years ago
- Contains the content for Tableau's OSS contribution guidelines☆11Nov 24, 2025Updated 4 months ago
- libtpms / swtpm software emulation of a Trusted Platform Module (TPM 1.2 and TPM 2.0) compile script☆13Sep 16, 2020Updated 5 years ago
- ☆19Dec 9, 2025Updated 4 months ago
- Code release for "TempLM: Distilling Language Models into Template-Based Generators"☆14Jul 21, 2022Updated 3 years ago
- Elixir rose trees and zippers☆12Jun 14, 2023Updated 2 years ago
- Authenticated Knowledge & Trust Architecture for AI Agents☆32Dec 17, 2025Updated 3 months ago
- Verify that any MCP server is running the intended and untampered code via hardware attestation.☆18Mar 28, 2025Updated last year
- k8s CSI driver for FastCFS☆13Mar 17, 2024Updated 2 years ago
- A scikit-learn compliant implementation of Monroe et al.'s Fightin' Words analysis method.☆11Mar 10, 2019Updated 7 years ago
- ☆39Oct 23, 2025Updated 5 months ago
- ☆15Sep 22, 2024Updated last year
- ☆12Jul 31, 2025Updated 8 months ago
- Customized Claude Code system prompts for use with tweakcc — ~48k bytes smaller, 30% faster, same accuracy☆35Nov 23, 2025Updated 4 months ago
- This repository helps you evaluate your models on the FreshStack benchmark!☆34Dec 9, 2025Updated 4 months ago
- A lightweight, self-hosted infrastructure layer for deploying and managing LLM agents as resilient microservices. Features automatic r…☆18Aug 4, 2025Updated 8 months ago
- Code for the MTEB Arena☆24Jul 2, 2025Updated 9 months ago
- Tool to perform paired evaluation of automatic systems☆13Oct 20, 2021Updated 4 years ago
- ☆21Updated this week
- Deep Weighted Averaging Classifiers☆22Feb 4, 2019Updated 7 years ago
- Library for action model acquisition from state trace data.☆25Jan 7, 2025Updated last year
- Connect to an LSF main node directly or through an SSH jump node, launch a Jupyter notebook via bsub and automatically open a tunnel. The n…☆20Oct 27, 2021Updated 4 years ago
- Use common pre-trained ML models in Deno!☆19Nov 21, 2021Updated 4 years ago
- kubeflow example☆18Jun 26, 2021Updated 4 years ago
- Faster R-CNN, an MXNet implementation with distributed training and data parallelization.☆36Jan 5, 2017Updated 9 years ago
- ☆24Nov 22, 2022Updated 3 years ago
- Large Language Model Text Generation Inference on Habana Gaudi☆34Mar 20, 2025Updated last year