tensorchord / deepseek-api-arena
A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments.
☆29 Updated 2 months ago
Alternatives and similar repositories for deepseek-api-arena
Users who are interested in deepseek-api-arena are comparing it to the repositories listed below.
- This repository contains statistics about the AI Infrastructure products. ☆18 Updated 3 months ago
- The driver for LMCache core to run in vLLM ☆41 Updated 4 months ago
- ☆27 Updated last month
- Distributed KV cache coordinator ☆31 Updated 2 weeks ago
- 🧯 Kubernetes coverage for fault awareness and recovery, works for any LLMOps, MLOps, AI workloads. ☆30 Updated 5 months ago
- Turn PostgreSQL into your search engine in a Pythonic way. ☆43 Updated 3 weeks ago
- Deploy ChatGLM on Modelz ☆15 Updated 2 years ago
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang ☆53 Updated 6 months ago
- KV cache store for distributed LLM inference ☆254 Updated last week
- Workflow Defined Engine ☆24 Updated last month
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆100 Updated last year
- An Envoy inspired, ultimate LLM-first gateway for LLM serving and downstream application developers and enterprises ☆20 Updated last month
- A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation. ☆81 Updated 2 weeks ago
- vLLM Router ☆29 Updated last year
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere… ☆20 Updated 3 weeks ago
- vLLM performance dashboard ☆30 Updated last year
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆132 Updated 11 months ago
- ☆108 Updated last year
- A collection of reproducible inference engine benchmarks ☆31 Updated last month
- Auto Thinking Mode switch for Qwen3 in Open webui ☆61 Updated 3 weeks ago
- ☆16 Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆16 Updated last year
- ☆85 Updated 2 months ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆127 Updated last month
- ☆12 Updated last month
- ☆19 Updated last year
- Elastic Deep Learning Training based on Kubernetes by Leveraging EDL and Volcano ☆32 Updated 2 years ago
- ☆47 Updated 2 weeks ago
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆158 Updated 8 months ago
- vLLM adapter for a TGIS-compatible gRPC server. ☆30 Updated this week