lucasjinreal / Crane
A pure-Rust LLM inference engine (supporting any LLM-based MLLM, such as Spark-TTS), powered by the Candle framework.
☆93 · Updated 3 weeks ago
Alternatives and similar repositories for Crane:
Users interested in Crane are comparing it to the libraries listed below.
- Blazingly fast inference of diffusion models. ☆106 · Updated 2 weeks ago
- ☆68 · Updated last month
- Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities, and AI-initiated follow-ups. F… ☆75 · Updated this week
- ☆54 · Updated this week
- Guaranteed structured output from any language model via hierarchical state machines. ☆124 · Updated this week
- ☆24 · Updated 2 months ago
- Command-line personal assistant using your favorite proprietary or local models, with access to more than 30 tools. ☆105 · Updated 2 weeks ago
- Open-source LLM UI, compatible with all local LLM providers. ☆173 · Updated 6 months ago
- A pipeline-parallel training script for LLMs. ☆137 · Updated 2 weeks ago
- Easily convert Hugging Face models to GGUF format for llama.cpp. ☆21 · Updated 8 months ago
- Run multiple resource-heavy large models (LMs) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆55 · Updated last month
- ☆84 · Updated 3 months ago
- Kyutai with an "eye". ☆182 · Updated 3 weeks ago
- Idea: https://github.com/nyxkrage/ebook-groupchat/ ☆86 · Updated 8 months ago
- The heart of the Pulsar App: fast, secure, and shared inference with a modern UI. ☆56 · Updated 4 months ago
- ☆197 · Updated this week
- A real-time speech-to-speech chatbot powered by Whisper Small, Llama 3.2, and Kokoro-82M. ☆218 · Updated 2 months ago
- Fast, state-of-the-art speech models and a runtime that runs anywhere 💥 ☆55 · Updated 2 months ago
- A Fish Speech implementation in Rust, using Candle.rs. ☆77 · Updated last month
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX. ☆76 · Updated 4 months ago
- High-performance text-to-speech server with an OpenAI-compatible API, 8 voices, emotion tags, and a modern web UI. Optimized for RTX GPUs. ☆232 · Updated this week
- Run Orpheus 3B locally with LM Studio. ☆362 · Updated 3 weeks ago
- Super-simple Python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run! ☆23 · Updated 2 weeks ago
- Easily view and modify JSON datasets for large language models. ☆73 · Updated last month
- Automatically quantize GGUF models. ☆167 · Updated last week
- ☆130 · Updated last week
- A cutting-edge cascading voice assistant combining real-time speech recognition, AI reasoning, and neural text-to-speech capabilities. ☆57 · Updated last week
- Easy-to-use interface for the Whisper model, optimized for all GPUs! ☆79 · Updated last week
- Whisper STT + Orpheus TTS + Gemma 3, using LM Studio to create a virtual assistant. ☆41 · Updated 2 weeks ago
- 🦛 CHONK your texts with Chonkie ✨ - the no-nonsense RAG chunking library. ☆24 · Updated 5 months ago