adriancable / qwen3.c
Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies.
☆83Updated last week
Alternatives and similar repositories for qwen3.c
Users who are interested in qwen3.c are comparing it to the libraries listed below.
- ☆131Updated 2 months ago
- Guaranteed Structured Output from any Language Model via Hierarchical State Machines☆140Updated last month
- ☆79Updated this week
- InferX is an Inference Function as a Service Platform☆115Updated 2 weeks ago
- Sparse Inferencing for transformer based LLMs☆193Updated this week
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe…☆67Updated 2 weeks ago
- ☆95Updated 6 months ago
- Automatically quantize GGUF models☆185Updated last week
- Distributed inference for MLX LLMs☆93Updated 11 months ago
- Easily view and modify JSON datasets for large language models☆77Updated last month
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs☆78Updated 9 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2.☆156Updated last year
- 1.58-bit LLaMa model☆81Updated last year
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX.☆89Updated 2 weeks ago
- AI management tool☆118Updated 8 months ago
- Open source LLM UI, compatible with all local LLM providers.☆175Updated 9 months ago
- ☆307Updated 3 months ago
- Serving LLMs in the HF-Transformers format via a PyFlask API☆71Updated 10 months ago
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal.☆257Updated 4 months ago
- ☆204Updated last month
- A fast batching API to serve LLM models☆183Updated last year
- Fast parallel LLM inference for MLX☆198Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models☆173Updated last year
- ☆80Updated 4 months ago
- ☆28Updated last month
- CaSIL is an advanced natural language processing system that implements a sophisticated four-layer semantic analysis architecture. It pro…☆66Updated 8 months ago
- ☆49Updated 4 months ago
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs☆430Updated this week
- ☆131Updated 2 months ago
- A pipeline parallel training script for LLMs.☆152Updated 2 months ago