Yoosu-L / llmapibenchmark
The LLM API Benchmark Tool is a flexible Go-based utility designed to measure and analyze the performance of OpenAI-compatible API endpoints across different concurrency levels.
☆68Updated 3 months ago
Alternatives and similar repositories for llmapibenchmark
Users who are interested in llmapibenchmark are comparing it to the libraries listed below.
- LM inference server implementation based on *.cpp.☆295Updated 2 months ago
- run DeepSeek-R1 GGUFs on KTransformers☆261Updated 11 months ago
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second.☆238Updated last month
- Self-hosted Hugging Face mirror service.☆212Updated 6 months ago
- Convert different model APIs into the OpenAI API format out of the box.☆160Updated last year
- Open Source Text Embedding Models with OpenAI Compatible API☆167Updated last year
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.☆192Updated last month
- Library for model distillation☆161Updated 5 months ago
- An API simulator for LightRAG that lets you use LightRAG in Open WebUI through its built-in Ollama interface; adding a prefix to a chat message also switches the LightRAG mode.☆30Updated last year
- LLMPerf is a library for validating and benchmarking LLMs☆1,081Updated last year
- Clone of https://r.jina.ai which is deployable locally☆50Updated last year
- gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, TTS, text-to-image, image-editing, and text-to-video models.☆244Updated last week
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.☆78Updated last year
- ☆94Updated 7 months ago
- Model Context Protocol Servers for Milvus☆214Updated last month
- 📚 This is an adapted version of Jina AI's Reader for local deployment using Docker. Convert any URL to an LLM-friendly input with a simp…☆285Updated 6 months ago
- LLM Inference benchmark☆433Updated last year
- LLM Group Chat Framework: chat with multiple LLMs at the same time.☆322Updated 7 months ago
- Open-source observability for your LLM application.☆53Updated last year
- Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone☆315Updated 6 months ago
- Dify 1.0 Plugin Support MCP Tools Agent strategies☆129Updated last week
- Common recipes to run vLLM☆364Updated last week
- ☆395Updated this week
- Pretrain, finetune and serve LLMs on Intel platforms with Ray☆130Updated 4 months ago
- MCP Server for Bing Search API☆75Updated 10 months ago
- A proxy server for multiple ollama instances with Key security☆582Updated this week
- The main repository for building Pascal-compatible versions of ML applications and libraries.☆169Updated 5 months ago
- Using Groq or OpenAI or Ollama to create o1-like reasoning chains☆291Updated last year
- Add a 🚀 streaming web server to GraphRAG: compatible with the OpenAI SDK, supports accessible entity links 🔗 and suggested questions, works with local embedding models, and fixes many issues.☆263Updated 10 months ago
- Multi-Faceted AI Agent and Workflow Autotuning. Automatically optimizes LangChain, LangGraph, DSPy programs for better quality, lower exe…☆269Updated 8 months ago