EvilFreelancer / docker-llama.cpp-rpc
This project is based on llama.cpp and builds only the RPC server, along with the auxiliary utilities that run in RPC-client mode. These are needed for distributed inference of Large Language Models (LLMs) and embedding models converted to the GGUF format.
☆23 Updated 7 months ago
Alternatives and similar repositories for docker-llama.cpp-rpc
Users that are interested in docker-llama.cpp-rpc are comparing it to the libraries listed below
- whisper.cpp HTTP transcription server with OpenAI-like API in Docker ☆27 Updated 5 months ago
- Tools and agents for automated research. ☆47 Updated last month
- Dialoqbase Lite is a Chrome extension that offers a web-based UI and a side panel, Copilot, designed specifically for almost all AI provi… ☆43 Updated 8 months ago
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆50 Updated 7 months ago
- 🔥 LitLytics - an affordable, simple analytics platform that leverages LLMs to automate data analysis ☆103 Updated last year
- OpenAPI-like API-server for voice generation (TTS) based on fish-speech-1.5 model. ☆28 Updated 7 months ago
- ☆31 Updated last year
- Complex RAG backend ☆29 Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆130 Updated 2 years ago
- AI agent to automatically check grammar and spelling on documentation files ☆93 Updated last month
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆17 Updated last year
- High-performance lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover and unified model di… ☆124 Updated 2 weeks ago
- Kroko ASR - Speech-to-text ☆123 Updated 2 months ago
- The hearth of The Pulsar App, fast, secure and shared inference with modern UI ☆59 Updated last year
- Download models from the Ollama library, without Ollama ☆118 Updated last year
- From-scratch implementation of OpenAI's GPT-OSS model in Python. No Torch, No GPUs. ☆107 Updated 2 months ago
- LLM Chat is an open-source serverless alternative to ChatGPT. ☆35 Updated last year
- This project provides a Flask-based API for generating high-quality text-to-speech (TTS) audio using F5-TTS, a flexible and powerful TTS … ☆14 Updated 4 months ago
- Deployment a light and full OpenAI API for production with vLLM to support /v1/embeddings with all embeddings models. ☆44 Updated last year
- Thin wrapper around OpenAI Whisper API with streaming support ☆86 Updated last month
- Self-host LLMs with vLLM and BentoML ☆163 Updated last month
- ☆14 Updated last year
- A QT GUI for large language models ☆38 Updated 2 years ago
- Sentence Transformers API: An OpenAI compatible embedding API server ☆69 Updated last year
- Training and data processing code for Saiga ☆54 Updated last week
- PyPlexitas is an open-source Python CLI alternative to Perplexity AI, designed to perform web searches, scrape content, generate embeddin… ☆36 Updated last year
- Web Interface for Vision Language Models Including InternVLM2 ☆25 Updated last year
- ☆14 Updated 11 months ago
- A simple Next.js frontend to explore your local weaviate collections and data ☆39 Updated 6 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆86 Updated this week