woct0rdho / transformers-qwen3-moe-fused
Fused Qwen3 MoE layer for faster training, compatible with Transformers, LoRA, bnb 4-bit quantization, and Unsloth. It is also possible to train LoRA over GGUF models.
☆229 · Updated last week
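The repository targets the standard Transformers + PEFT training path, so a rough idea of the intended setting is a QLoRA-style fine-tune of a Qwen3 MoE checkpoint. The sketch below uses only stock Transformers, bitsandbytes, and PEFT calls; the checkpoint name is an example, and the point where the fused MoE layer would be swapped in is left as a comment because this listing does not document the repo's actual API.

```python
# Minimal sketch of the setting the fused layer targets (not the repo's documented API):
# QLoRA-style fine-tuning of a Qwen3 MoE checkpoint with stock Transformers + PEFT.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "Qwen/Qwen3-30B-A3B"  # example Qwen3 MoE checkpoint

# bnb 4-bit quantization, as mentioned in the repo description
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Hypothetical step: this is where the stock sparse-MoE blocks would be replaced by
# the fused implementation, using whatever entry point transformers-qwen3-moe-fused
# actually exposes (not shown here).

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```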
Alternatives and similar repositories for transformers-qwen3-moe-fused
Users interested in transformers-qwen3-moe-fused are comparing it to the libraries listed below
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs ☆203 · Updated 2 months ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation ☆120 · Updated 8 months ago
- A repository aimed at pruning DeepSeek V3, R1, and R1-Zero to a usable size ☆82 · Updated 5 months ago
- ☆92 · Updated 8 months ago
- ☆82 · Updated 10 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆139 · Updated last year
- Parallel Scaling Law for Language Models: Beyond Parameter and Inference Time Scaling ☆468 · Updated 8 months ago
- A collection of tricks and tools to speed up transformer models ☆194 · Updated last month
- ☆64 · Updated 8 months ago
- ☆520 · Updated last month
- Nano repo for RL training of LLMs ☆70 · Updated 3 months ago
- Block Diffusion for Ultra-Fast Speculative Decoding ☆459 · Updated this week
- ☆66 · Updated 10 months ago
- A highly capable, lightweight 2.4B LLM trained on only 1T tokens of pre-training data, with all training details released ☆222 · Updated 6 months ago
- Cookbook of SGLang recipes ☆65 · Updated this week
- A minimal GRPO implementation from scratch ☆102 · Updated 10 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆227 · Updated 3 months ago
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec… ☆225 · Updated 3 weeks ago
- Lightweight toolkit to train and fine-tune 1.58-bit language models ☆110 · Updated 8 months ago
- Tina: Tiny Reasoning Models via LoRA ☆316 · Updated 4 months ago
- Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI. ☆253 · Updated 4 months ago
- ☆74 · Updated 8 months ago
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens. ☆277 · Updated 3 months ago
- Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI. ☆89 · Updated 3 months ago
- Implementation of the LongRoPE paper: Extending LLM Context Window Beyond 2 Million Tokens ☆150 · Updated last year
- A pipeline for LLM knowledge distillation ☆112 · Updated 10 months ago
- ☆148 · Updated last year
- [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond ☆190 · Updated 6 months ago
- Easy-to-use, high-performance knowledge distillation for LLMs ☆97 · Updated 8 months ago
- MrlX: A Multi-Agent Reinforcement Learning Framework ☆189 · Updated 2 weeks ago