reinterpretcat / qwen3-rsLinks

An educational Rust project for exporting and running inference on Qwen3 LLM family

☆34

Alternatives and similar repositories for qwen3-rs

Users that are interested in qwen3-rs are comparing it to the libraries listed below

Sorting:

spyglass-search / memex
Super-simple, fully Rust powered "memory" (doc store + semantic search) for LLM projects, semantic search, etc.
☆62Updated 2 years ago
PlugOvr-ai / PlugOvr
AI Assistant
☆20Updated 7 months ago
samuel-vitorino / lm.rs-webui
Light WebUI for lm.rs
☆24Updated last year
Noveum / ai-gateway
Built for demanding AI workflows, this gateway offers low-latency, provider-agnostic access, ensuring your AI applications run smoothly a…
☆82Updated 5 months ago
jimexist / surya-rs
Rust implementation of Surya
☆63Updated 8 months ago
Oxen-AI / GRPO-With-Cargo-Feedback
This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback
☆111Updated 8 months ago
ShelbyJenkins / candle_embed
A simple, CUDA or CPU powered, library for creating vector embeddings using Candle and models from Hugging Face
☆46Updated last year
marcosomma / orka-reasoning
Orchestrator Kit for Agentic Reasoning - OrKa is a modular AI orchestration system that transforms Large Language Models (LLMs) into comp…
☆60Updated this week
tomsanbear / bitnet-rs
Implementing the BitNet model in Rust
☆41Updated last year
nath1295 / MLX-Textgen
A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX.
☆99Updated 4 months ago
phildougherty / llmsh
*NIX SHELL with Local AI/LLM integration
☆24Updated 8 months ago
pixelspark / poly
A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust
☆79Updated last year
MinishLab / model2vec-rs
Official Rust Implementation of Model2Vec
☆141Updated last month
sam-paech / auto-antislop
☆90Updated 2 weeks ago
baehyunsol / ragit
git-like rag pipeline
☆248Updated last week
lucasjinreal / Crane
A Pure Rust based LLM (Any LLM based MLLM such as Spark-TTS) Inference Engine, powering by Candle framework.
☆202Updated 2 weeks ago
gigit0000 / qwen3.c
Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing.
☆17Updated 2 months ago
adriancable / qwen3.c
Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies.
☆145Updated 4 months ago
mdegans / drama_llama
Yet another `llama.cpp` Rust wrapper
☆12Updated last year
KerfuffleV2 / ggml-sys-bleedingedge
Bleeding edge low level Rust binding for GGML
☆16Updated last year
EndlessReform / fish-speech.rs
A Fish Speech implementation in Rust, with Candle.rs
☆103Updated 5 months ago
Codys12 / airllm
AirLLM 70B inference with single 4GB GPU
☆14Updated 4 months ago
perk11 / large-model-proxy
Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe…
☆84Updated 3 weeks ago
jxqu3 / aiui
A simple no-install web UI for Ollama and OAI-Compatible APIs!
☆31Updated 9 months ago
ShelbyJenkins / llm_client
The Easiest Rust Interface for Local LLMs and an Interface for Deterministic Signals from Probabilistic LLM Vibes
☆239Updated 3 months ago
guidance-ai / llgtrt
TensorRT-LLM server with Structured Outputs (JSON) built with Rust
☆60Updated 6 months ago
SamuelTallet / alpine-llama-cpp-server
A lightweight LLaMA.cpp HTTP server Docker image based on Alpine Linux.
☆29Updated last month
KerfuffleV2 / smolrsrwkv
A relatively basic implementation of RWKV in Rust written by someone with very little math and ML knowledge. Supports 32, 8 and 4 bit eva…
☆94Updated 2 years ago
wavify-labs / wavify-sdks
fast state-of-the-art speech models and a runtime that runs anywhere 💥
☆57Updated 5 months ago
rodrigobaron / anthill
☆24Updated 10 months ago