reinterpretcat / qwen3-rs
An educational Rust project for exporting and running inference on the Qwen3 LLM family
☆28 · Updated last month
Alternatives and similar repositories for qwen3-rs
Users interested in qwen3-rs are comparing it to the libraries listed below.
- AI Assistant ☆20 · Updated 5 months ago
- A simple, CUDA- or CPU-powered library for creating vector embeddings using Candle and models from Hugging Face ☆39 · Updated last year
- Light WebUI for lm.rs ☆24 · Updated 11 months ago
- Super-simple, fully Rust-powered "memory" (doc store + semantic search) for LLM projects, semantic search, etc. ☆62 · Updated last year
- A pure-Rust LLM inference engine (including LLM-based MLLMs such as Spark-TTS), powered by the Candle framework. ☆159 · Updated last month
- Lightweight C inference for Qwen3 GGUF. Multi-turn prefix caching & batch processing. ☆18 · Updated 2 weeks ago
- Official Rust implementation of Model2Vec ☆135 · Updated last week
- Git-like RAG pipeline ☆244 · Updated this week
- Implementation of the BitNet model in Rust ☆39 · Updated last year
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX. ☆95 · Updated 2 months ago
- Code for fine-tuning LLMs with GRPO specifically for Rust programming, using cargo as feedback ☆104 · Updated 6 months ago
- Build tools for LLMs in Rust using the Model Context Protocol ☆38 · Updated 6 months ago
- Rust implementation of Surya ☆60 · Updated 6 months ago
- ☆24 · Updated 7 months ago
- *NIX shell with local AI/LLM integration ☆23 · Updated 6 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆82 · Updated last week
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies. ☆110 · Updated 2 months ago
- AirLLM 70B inference with a single 4GB GPU ☆14 · Updated 2 months ago
- 33B Chinese LLM, DPO QLoRA, 100K context, AirLLM 70B inference with a single 4GB GPU ☆13 · Updated last year
- A Fish Speech implementation in Rust, with Candle.rs ☆97 · Updated 3 months ago
- Built for demanding AI workflows, this gateway offers low-latency, provider-agnostic access, ensuring your AI applications run smoothly a… ☆76 · Updated 3 months ago
- A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust ☆79 · Updated last year
- Library for doing RAG ☆76 · Updated last week
- Blazing-fast Rust implementation of Sesame's Conversational Speech Model (CSM) ☆58 · Updated 2 weeks ago
- A lightweight LLaMA.cpp HTTP server Docker image based on Alpine Linux. ☆29 · Updated 3 weeks ago
- A simple, "Ollama-like" tool for managing and running GGUF language models from your terminal. ☆21 · Updated last week
- An MCP-enabled Qwen3 0.6B demo with an adjustable thinking budget, all in your browser! ☆25 · Updated 3 months ago
- auto-rust is an experimental project that automatically generates Rust code with LLMs (Large Language Models) during compilation, utilizing… ☆41 · Updated 10 months ago
- Ollama-like CLI tool for MLX models on Hugging Face (pull, rm, list, show, serve, etc.) ☆101 · Updated this week
- Fast state-of-the-art speech models and a runtime that runs anywhere 💥 ☆56 · Updated 3 months ago