LlamaEdge / sd-api-server
The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge
☆24 · Updated 10 months ago
Alternatives and similar repositories for sd-api-server
Users interested in sd-api-server are comparing it to the libraries listed below
- A RAG API server written in Rust following OpenAI specs ☆60 · Updated 9 months ago
- The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge ☆26 · Updated 10 months ago
- OpenAI-compatible API for serving the LLaMA-2 model ☆218 · Updated 2 years ago
- A pure-Rust LLM inference engine (supporting any LLM-based MLLM such as Spark-TTS), powered by the Candle framework ☆229 · Updated 3 weeks ago
- The MCP enterprise actors-based server, or mcp-ectors for short ☆31 · Updated 7 months ago
- 🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast, real-time TTS with high quality ☆685 · Updated last week
- ☆256 · Updated 3 months ago
- Library for doing RAG ☆81 · Updated 3 weeks ago
- Fast serverless LLM inference, in Rust ☆108 · Updated 2 months ago
- Blazingly fast inference of diffusion models ☆118 · Updated 9 months ago
- The Google MediaPipe AI library. Write AI inference applications for image recognition, text classification, audio/video processing and… ☆225 · Updated last week
- 🦀 A Pure Rust Framework For Building AGI (WIP) ☆111 · Updated last month
- A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust ☆79 · Updated 2 years ago
- WebAssembly (Wasm) build and bindings for llama.cpp ☆285 · Updated last year
- Rust bindings for OpenNMT/CTranslate2 ☆49 · Updated last week
- LLM Proxy ☆12 · Updated last year
- Rust bindings to https://github.com/k2-fsa/sherpa-onnx ☆275 · Updated 2 months ago
- High-level, optionally asynchronous Rust bindings to llama.cpp ☆240 · Updated last year
- The Easiest Rust Interface for Local LLMs, and an Interface for Deterministic Signals from Probabilistic LLM Vibes ☆239 · Updated 5 months ago
- A simple, CUDA- or CPU-powered library for creating vector embeddings using Candle and models from Hugging Face ☆46 · Updated last year
- Backend for https://github.com/pinokiocomputer/pinokio ☆60 · Updated last week
- ☆441 · Updated this week
- Embed WasmEdge functions in a Rust host app ☆33 · Updated last year
- The all-in-one RWKV runtime box with embed, RAG, AI agents, and more ☆596 · Updated 3 months ago
- A Fish Speech implementation in Rust, with Candle.rs ☆106 · Updated 7 months ago
- AI Assistant ☆20 · Updated 9 months ago
- Low-rank adaptation (LoRA) for Candle ☆169 · Updated 9 months ago
- AI gateway and observability server written in Rust. Designed to help optimize multi-agent workflows ☆65 · Updated last year
- Efficient platform for inference and serving local LLMs, including an OpenAI-compatible API server ☆574 · Updated this week
- Run LLaMA inference on CPU, with Rust 🦀🚀🦙 ☆33 · Updated last year