tensorchord / deepseek-api-arena
A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments.
☆28 Updated 2 months ago
Alternatives and similar repositories for deepseek-api-arena
Users who are interested in deepseek-api-arena are comparing it to the repositories listed below.
- This repository contains statistics about the AI Infrastructure products. ☆18 Updated 3 months ago
- ☆28 Updated 2 months ago
- The driver for LMCache core to run in vLLM ☆42 Updated 4 months ago
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang ☆53 Updated 7 months ago
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec… ☆151 Updated this week
- vLLM Router ☆29 Updated last year
- patches for huggingface transformers to save memory ☆23 Updated 3 weeks ago
- Workflow Defined Engine ☆24 Updated 2 months ago
- KV cache store for distributed LLM inference ☆269 Updated 2 weeks ago
- A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation. ☆87 Updated last month
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆133 Updated last year
- ☆13 Updated 2 months ago
- ☆86 Updated 3 months ago
- Deploy ChatGLM on Modelz ☆15 Updated 2 years ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆100 Updated last year
- ☆26 Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆16 Updated last year
- ☆54 Updated 7 months ago
- vLLM adapter for a TGIS-compatible gRPC server. ☆32 Updated this week
- ☆71 Updated last month
- ☆30 Updated 9 months ago
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere… ☆20 Updated last month
- xet client tech, used in huggingface_hub ☆118 Updated this week
- An Envoy inspired, ultimate LLM-first gateway for LLM serving and downstream application developers and enterprises ☆20 Updated 2 months ago
- A simple calculation for LLM MFU. ☆38 Updated 3 months ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 Updated last year
- ☆50 Updated last month
- Auto Thinking Mode switch for Qwen3 in Open webui ☆65 Updated last month
- Chinese tokens in tiktoken tokenizers. ☆32 Updated last year
- ByteCheckpoint: An Unified Checkpointing Library for LFMs ☆219 Updated 2 months ago