REAP: Router-weighted Expert Activation Pruning for SMoE compression
☆347Apr 17, 2026Updated 2 weeks ago
Alternatives and similar repositories for reap
Users that are interested in reap are comparing it to the libraries listed below.
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs☆809Updated this week
- ☆21Apr 2, 2025Updated last year
- A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp☆25Updated this week
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"☆30Jun 30, 2025Updated 10 months ago
- Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs☆23Nov 11, 2025Updated 5 months ago
- A comprehensive and efficient long-context model evaluation framework☆31Feb 25, 2026Updated 2 months ago
- Standalone repo for our Atropos integration with Thinking Machines Tinker API (https://thinkingmachines.ai/tinker/)☆64Mar 22, 2026Updated last month
- Model souping for LLMs☆73Nov 18, 2025Updated 5 months ago
- Mini Model Daemon☆13Nov 9, 2024Updated last year
- Official repository of Sparse ISO-FLOP Transformations for Maximizing Training Efficiency☆25Jul 31, 2024Updated last year
- Reproducing RigL (ICML 2020) as a part of ML Reproducibility Challenge 2020☆29Jan 6, 2022Updated 4 years ago
- LLMProxy is an intelligent large language model backend routing proxy service.☆24Dec 6, 2025Updated 4 months ago
- Use winsqlite3.dll (the SQLite DLL that ships with Windows 10) in PowerShell☆13Jan 12, 2025Updated last year
- Neural Homomorphic Vocoder optimized for singing voice synthesis☆29Mar 20, 2026Updated last month
- A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation.☆105Jul 9, 2025Updated 9 months ago
- ☆10Mar 8, 2025Updated last year
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion☆14Mar 17, 2025Updated last year
- NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits (ICML'25)☆44Jul 9, 2025Updated 9 months ago
- 🌳 MCTS-inspired parallel beam search for conversation optimization. Explore multiple dialogue strategies simultaneously, stress-test a…☆36Jan 18, 2026Updated 3 months ago
- [ICLR 2026 🔥] Dr.LLM: Dynamic Layer Routing in LLMs☆45Apr 21, 2026Updated last week
- A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support…☆1,068Updated this week
- ☆10Mar 20, 2024Updated 2 years ago
- Kanade is a single-layer disentangled speech tokenizer that extracts compact tokens suitable for both generative and discriminative model…☆93Apr 3, 2026Updated 3 weeks ago
- Exploring how optimizations for GEMMs work☆31Feb 28, 2026Updated 2 months ago
- Simplifying cheque processing for banks using Transformers☆17Dec 15, 2022Updated 3 years ago
- The Active Reliability Layer for AI Agents. Catch failures, teach fixes, and automate reliability☆132Jan 19, 2026Updated 3 months ago
- Direct Preference Optimization for RWKV, aiming for RWKV-5 and 6.☆11Mar 1, 2024Updated 2 years ago
- A fully autonomous agent that accesses the browser and performs tasks.☆18Apr 25, 2025Updated last year
- ☆12Dec 21, 2024Updated last year
- Official Implementation for NorMuon paper☆66Mar 11, 2026Updated last month
- [ICLR 2026] RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling☆37Feb 25, 2026Updated 2 months ago
- Your AI Soul Companion. Self-hosted AI agent across 30+ messaging channels. It can not only serve as an emotional companion in daily life …☆45Apr 10, 2026Updated 3 weeks ago
- Memory Agent monorepo☆86Oct 9, 2025Updated 6 months ago
- Personal voice assistant, with voice interruption and Twilio support☆18Feb 24, 2025Updated last year
- A 20M RWKV v6 can do nonogram☆13Oct 18, 2024Updated last year
- Official Chinese documentation for RWKV | RWKV官方中文文档☆15Apr 16, 2026Updated 2 weeks ago
- A Knowledge-grounded framework for Autonomous ML/AI Program Synthesis and Optimization☆90Feb 20, 2026Updated 2 months ago
- ☆19Apr 18, 2025Updated last year
- This project aims to provide a quick and efficient way to capture any thought to your AnyType second brain. It leverages the protobuf GRP…☆15Aug 26, 2024Updated last year