Emericen/tiny-qwen

Emericen / tiny-qwen

A minimal PyTorch re-implementation of Qwen 3.5

☆430

Alternatives and similar repositories for tiny-qwen

Users who are interested in tiny-qwen are comparing it to the repositories listed below.

Emericen / tiny-deepseek-r1
DeepSeek R1 distilled into smaller OSS models for hobbyists
☆17 · Dec 2, 2025 · Updated 7 months ago
TiledTensor / TiledBench
Benchmark tests supporting the TiledCUDA library.
☆19 · Nov 19, 2024 · Updated last year
GeeeekExplorer / nano-vllm
Nano vLLM
☆14,557 · Apr 26, 2026 · Updated 2 months ago
huggingface / nanoVLM
The simplest, fastest repository for training/finetuning small-sized VLMs.
☆4,957 · Oct 27, 2025 · Updated 8 months ago
wangzhaode / mnn-asr
mnn asr demo.
☆27 · Mar 24, 2025 · Updated last year
OpenBMB / infllmv2_cuda_impl
☆102 · Feb 11, 2026 · Updated 5 months ago
yassa9 / qwen600
Static suckless single batch CUDA-only qwen3-0.6B mini inference engine
☆557 · Sep 8, 2025 · Updated 10 months ago
NVlabs / Long-RL
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
☆726 · Sep 24, 2025 · Updated 9 months ago
adriancable / qwen3.c
Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies.
☆183 · Jul 5, 2025 · Updated last year
hiyouga / EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,071 · Updated this week
phonism / genesis
Genesis is a lightweight deep learning framework written from scratch in Python, with Triton as its backend for high-performance computing…
☆35 · Jan 15, 2026 · Updated 6 months ago
mdy666 / Qwen-Native-Sparse-Attention
qwen-nsa
☆87 · Oct 14, 2025 · Updated 9 months ago
THUDM / slime
slime is an LLM post-training framework for RL Scaling.
☆7,551 · Updated this week
ByteDance-Seed / cudaLLM
☆148 · Aug 18, 2025 · Updated 11 months ago
azuresky03 / distill_wan2.1
☆26 · May 30, 2025 · Updated last year
RiseAI-Sys / DAX
High performance inference engine for diffusion models
☆107 · Sep 5, 2025 · Updated 10 months ago
zhuzilin / flash-attention-with-sink
☆37 · Aug 7, 2025 · Updated 11 months ago
2U1 / Qwen-VL-Series-Finetune
An open-source implementation for fine-tuning Qwen-VL series by Alibaba Cloud.
☆1,938 · Updated this week
fla-org / flash-linear-attention
🚀 Efficient implementations for emerging model architectures
☆5,379 · Updated this week
alientony / Split-brain
This is a training method to produce a split brain model
☆14 · Mar 7, 2025 · Updated last year
QwenLM / Qwen3-VL
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
☆19,630 · Jan 30, 2026 · Updated 5 months ago
ShaohonChen / Qwen3-SmVL
Spliced the vision head of SmolVLM2 onto the Qwen3-0.6B model and fine-tuned the combination
☆602 · Sep 8, 2025 · Updated 10 months ago
QuwsarOhi / NanoAgent
An agent that can run everywhere - even in your watch!
☆34 · Apr 8, 2026 · Updated 3 months ago
feifeibear / ChituAttention
Quantized Attention on GPU
☆45 · Nov 22, 2024 · Updated last year
verl-project / verl
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,571 · Updated this week
yifan123 / flow_grpo
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
☆2,420 · May 7, 2026 · Updated 2 months ago
facebookresearch / tuna-2
Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation
☆738 · Updated this week
Abinesh-Mathivanan / beens-minimax
world's stupidest moe llm in 103M parameters
☆20 · Jul 18, 2025 · Updated last year
Jeremy2001-chen / OS-RISCV
A small RISC-V kernel written in C, tested on the SiFive Unmatched board.
☆16 · Aug 20, 2022 · Updated 3 years ago
modelscope / ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL…
☆14,866 · Updated this week
AndreSlavescu / mHC.cu
mHC kernels implemented in CUDA
☆264 · Mar 9, 2026 · Updated 4 months ago
facebookexperimental / CUTracer
A dynamic binary instrumentation tool for tracing and analyzing CUDA kernel instructions.
☆72 · Updated this week
ShivamDuggal4 / karl
Single-pass Adaptive Image Tokenization for Minimum Program Search | What's the Kolmogorov Complexity of an Image?
☆43 · Jul 26, 2025 · Updated 11 months ago
MoonshotAI / Kimi-VL
Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities
☆1,204 · Jul 15, 2025 · Updated last year
BKHMSI / llm-localization
Repository for "The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units" Paper
☆21 · Nov 18, 2025 · Updated 8 months ago
radixark / miles
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
☆1,759 · Updated this week
tercumantanumut / GameCompanionAI
Game Companion AI is an advanced application designed to enhance the gaming experience by providing real-time analysis and interpretation…
☆55 · Sep 30, 2024 · Updated last year
xlite-dev / qwen-image-fast
⚡️Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs
☆17 · Oct 24, 2025 · Updated 8 months ago
xlite-dev / LeetCUDA
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
☆11,578 · Updated this week