☆42 · Aug 2, 2025 · Updated 7 months ago
Alternatives and similar repositories for Qwen_MOE_C
Users who are interested in Qwen_MOE_C are comparing it to the libraries listed below.
- ☆23 · Jan 14, 2025 · Updated last year
- An MCP stdio toolpack for local LLMs ☆22 · Oct 6, 2025 · Updated 5 months ago
- Single-file, pure CUDA C implementation for running inference on Qwen3 0.6B GGUF. No dependencies. ☆23 · Nov 26, 2025 · Updated 3 months ago
- A miniaturized version of the Kimi-K2 model optimized for deployment on single H100 GPUs. ☆36 · Jul 16, 2025 · Updated 7 months ago
- Protocol for Augmented Memory of Project Artifacts (MCP compatible) - extended ☆24 · Jan 24, 2026 · Updated last month
- ☆22 · Aug 9, 2025 · Updated 7 months ago
- AI Based "Happiness Optimizer" ☆12 · Oct 20, 2024 · Updated last year
- LLamaHTML is a simple HTML file to communicate with a running llamacpp llama-server ☆22 · Aug 5, 2025 · Updated 7 months ago
- A sleek web interface for Ollama, making local LLM management and usage simple. WebOllama provides an intuitive UI to manage Ollama model… ☆71 · Oct 8, 2025 · Updated 5 months ago
- ☆38 · Jan 15, 2025 · Updated last year
- ☆15 · Feb 1, 2025 · Updated last year
- SwiftLet is a lightweight Python framework for running open-source Large Language Models (LLMs) locally using safetensors ☆28 · Aug 6, 2025 · Updated 7 months ago
- A curated collection of persona-based MCP server & tool groupings. ☆36 · Sep 11, 2025 · Updated 5 months ago
- High-Performance Text Deduplication Toolkit ☆62 · Aug 25, 2025 · Updated 6 months ago
- ☆19 · Jul 4, 2025 · Updated 8 months ago
- A powerful system for crawling documentation websites, extracting code snippets, and providing fast search capabilities via MCP (Model C… ☆27 · Dec 25, 2025 · Updated 2 months ago
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools such as web search … ☆51 · Feb 10, 2026 · Updated 3 weeks ago
- ☆64 · Jun 24, 2025 · Updated 8 months ago
- Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing. ☆23 · Sep 1, 2025 · Updated 6 months ago
- Llama.cui is a small llama.cpp-based chat application for Node.js ☆20 · Jul 10, 2025 · Updated 8 months ago
- ☆17 · Jan 27, 2025 · Updated last year
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆53 · Mar 21, 2025 · Updated 11 months ago
- ☆23 · Updated this week
- Metal GPU implementation of the Qwen3 transformer model on macOS with complete Apple Silicon compute shader acceleration. ☆42 · Oct 6, 2025 · Updated 5 months ago
- Service for testing out the new Qwen2.5 omni model ☆63 · Apr 30, 2025 · Updated 10 months ago
- ☆23 · Sep 27, 2024 · Updated last year
- ☆46 · Feb 19, 2026 · Updated 2 weeks ago
- A simple no-install web UI for Ollama and OAI-Compatible APIs! ☆31 · Jan 30, 2025 · Updated last year
- An API for VoiceCraft. ☆25 · Jun 27, 2024 · Updated last year
- RP/Writing focused LLM frontend ☆69 · Updated this week
- Neural Audio Codecs implemented in C# - DAC, SNAC, Encodec, Dia ☆45 · Jun 11, 2025 · Updated 8 months ago
- ☆35 · Mar 22, 2025 · Updated 11 months ago
- Another frontend for Ollama ☆31 · Nov 15, 2025 · Updated 3 months ago
- A persistent local memory for AI, LLMs, or Copilot in VS Code. ☆206 · Feb 24, 2026 · Updated last week
- GPT-4 Level Conversational QA Trained In a Few Hours ☆68 · Aug 21, 2024 · Updated last year
- List of general Data Structures in several languages ☆26 · Jun 24, 2020 · Updated 5 years ago
- This project provides code to accompany the "AI and ML for Web Devs" video series, focusing on teaching AI and ML concepts through hands-… ☆36 · Oct 12, 2025 · Updated 4 months ago
- Spotlight-like client for Ollama on Windows. ☆28 · May 18, 2024 · Updated last year
- Text-to-Speech (TTS) engine for the Armenian language ☆12 · Sep 29, 2024 · Updated last year