duydvu / triton-inference-server-web-ui
Triton Inference Server Web UI
☆15Updated last year
Alternatives and similar repositories for triton-inference-server-web-ui
Users who are interested in triton-inference-server-web-ui are comparing it to the repositories listed below
- Unofficial golang package for the Triton Inference Server (https://github.com/triton-inference-server/server)☆50Updated this week
- OpenAI compatible API for TensorRT LLM triton backend☆215Updated last year
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.☆161Updated 2 months ago
- A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments.☆29Updated 5 months ago
- Common source, scripts and utilities shared across all Triton repositories.☆76Updated 2 weeks ago
- Golang web client for Ollama, fast and easy to use.☆29Updated 2 months ago
- This project provides a Flask-based API for generating high-quality text-to-speech (TTS) audio using F5-TTS, a flexible and powerful TTS …☆14Updated last month
- xllamacpp - a Python wrapper of llama.cpp☆58Updated last week
- ☆296Updated last week
- MNN ASR demo.☆23Updated 6 months ago
- llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deploy…☆86Updated last year
- Autoscale LLM (vLLM, SGLang, LMDeploy) inferences on Kubernetes (and others)☆275Updated last year
- ☆112Updated last year
- Self-hosted Hugging Face mirror service.☆194Updated 2 months ago
- An open source chat bot architecture for voice/vision (and multimodal) assistants that can run locally (CPU/GPU bound) and remotely (I/O bound).☆74Updated this week
- The Triton backend for TensorFlow.☆53Updated 3 months ago
- instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,…☆53Updated last year
- A library integrating embedding and reranker models from OpenAI, SentenceTransformers, etc. for semantic search in vector databases.☆48Updated 5 months ago
- vLLM Router☆43Updated last year
- ASR using the OpenAI-compatible API `v1/audio/transcriptions`, as offered by providers like Groq and SiliconFlow☆32Updated last year
- Whisper inference with TensorRT-LLM☆22Updated 2 years ago
- A diverse, simple, and secure all-in-one LLMOps platform☆108Updated last year
- Port of FunASR's Paraformer model in C/C++☆35Updated last year
- ☆68Updated 2 years ago
- ☆125Updated last year
- ☆56Updated 10 months ago
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec…☆194Updated last week
- Open Source Text Embedding Models with OpenAI Compatible API☆160Updated last year
- ☆64Updated 5 months ago
- vLLM adapter for a TGIS-compatible gRPC server.☆41Updated this week