A repository aimed at pruning DeepSeek V3, R1 and R1-zero to a usable size
☆87Sep 5, 2025Updated 9 months ago
Alternatives and similar repositories for moe-pruner
Users that are interested in moe-pruner are comparing it to the libraries listed below.
- Direct Preference Optimization for RWKV, aiming for RWKV-5 and 6.☆11Mar 1, 2024Updated 2 years ago
- ☆12Dec 21, 2024Updated last year
- This is an Android App. Now with 100% less bugs.☆10Sep 26, 2019Updated 6 years ago
- Official Implementation for NorMuon paper☆81Apr 30, 2026Updated 2 months ago
- [ICLR25] STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs☆20Jun 3, 2025Updated last year
- ☆41Apr 30, 2025Updated last year
- ☆17Jan 1, 2025Updated last year
- Mini Model Daemon☆13Nov 9, 2024Updated last year
- D^2-MoE: Delta Decompression for MoE-based LLMs Compression☆82Mar 25, 2025Updated last year
- ☆29Aug 27, 2025Updated 10 months ago
- continuous batching and parallel acceleration for RWKV6☆22Jun 28, 2024Updated 2 years ago
- A toy text-to-image model trained from scratch.☆19Jun 9, 2025Updated last year
- ☆17Nov 23, 2023Updated 2 years ago
- Official Chinese documentation for RWKV | RWKV官方中文文档☆14Jun 10, 2026Updated 3 weeks ago
- MiSS is a novel PEFT method that features a low-rank structure but introduces a new update mechanism distinct from LoRA, achieving an exc…☆35Mar 9, 2026Updated 3 months ago
- Lottery Ticket Adaptation☆40Nov 20, 2024Updated last year
- https://x.com/BlinkDL_AI/status/1884768989743882276☆28May 4, 2025Updated last year
- "Robust Attributed Graph Alignment via Joint Structure Learning and Optimal Transport" in ICDE 2023☆18Oct 23, 2023Updated 2 years ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Feb 9, 2025Updated last year
- The application of large pre-trained vision model DINOv2 from MetaAI for feature points matching, and a ViT decoder used for Auto Encoder☆18Apr 27, 2023Updated 3 years ago
- Language modeling with linear-cost context☆118Sep 25, 2025Updated 9 months ago
- Demonstration of a factory pattern where the types automatically register themselves☆13Mar 13, 2019Updated 7 years ago
- ☆18Sep 29, 2024Updated last year
- Flash-Muon: An Efficient Implementation of Muon Optimizer☆257Jun 15, 2025Updated last year
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models☆266Apr 23, 2024Updated 2 years ago
- Course Project for COMP4471 on RWKV☆17Feb 11, 2024Updated 2 years ago
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters"☆22Oct 14, 2025Updated 8 months ago
- A program that allows you to chat on VRChat using ChatGPT.☆15Mar 22, 2023Updated 3 years ago
- langchain opentutorial utility package for Open Tutorial☆10Feb 2, 2025Updated last year
- Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton☆49Apr 2, 2026Updated 3 months ago
- The WorldRWKV project aims to implement training and inference across various modalities using the RWKV7 architecture. By leveraging diff…☆70Mar 18, 2026Updated 3 months ago
- ☆22Nov 26, 2025Updated 7 months ago
- rule matcher (context free grammar)☆10Dec 27, 2019Updated 6 years ago
- Awesome Entity Alignment is a collection of EA techniques, including papers, codes, and datasets.☆11Oct 27, 2022Updated 3 years ago
- Control LLM☆23Apr 6, 2025Updated last year
- [ACL 2026 Main] Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis☆45Updated this week
- This repo is a reproduction of Channel_pruning☆11May 17, 2018Updated 8 years ago
- ☆12Jul 7, 2022Updated 3 years ago
- [ICML 2025] No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces (official repository)☆45Aug 7, 2025Updated 10 months ago