CerebrasResearch / reap
REAP: Router-weighted Expert Activation Pruning for SMoE compression
☆232Dec 9, 2025Updated 2 months ago
Alternatives and similar repositories for reap
Users that are interested in reap are comparing it to the libraries listed below
- ☆18Dec 9, 2025Updated 2 months ago
- Get aid from local LLMs right in your PowerShell☆15May 2, 2025Updated 9 months ago
- A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp☆16Updated this week
- A Prompt Enhancer for flux.1 in ComfyUI☆12Jan 11, 2026Updated last month
- 🌳 MCTS-inspired parallel beam search for conversation optimization. Explore multiple dialogue strategies simultaneously, stress-test a…☆35Jan 18, 2026Updated 3 weeks ago
- Official and Third Party Plugins & Themes for DankMaterialShell☆34Updated this week
- 🔍📃 LLM-powered PDF Table Extractor☆19Jun 26, 2025Updated 7 months ago
- LLMProxy is an intelligent large language model backend routing proxy service.☆22Dec 6, 2025Updated 2 months ago
- Simple node proxy for llama-server that enables MCP use☆17May 10, 2025Updated 9 months ago
- Extension for AUTOMATIC1111/stable-diffusion-webui for pasting images from clipboard in any WebUI form.☆16Nov 22, 2023Updated 2 years ago
- JotItNow is an AI Voice Notes App☆24Mar 6, 2025Updated 11 months ago
- OpenAPI specifications => MCP (Model Context Protocol) tools☆19Dec 9, 2024Updated last year
- (ICLR'26 + Netflix) Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning☆37Nov 17, 2025Updated 2 months ago
- NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits (ICML'25)☆42Jul 9, 2025Updated 7 months ago
- A fully autonomous agent that accesses the browser and performs tasks.☆17Apr 25, 2025Updated 9 months ago
- Personal voice assistant, with voice interruption and Twilio support☆18Feb 24, 2025Updated 11 months ago
- A tool for adding function calling to an LLM API, available as a service by following the link☆22Aug 11, 2025Updated 6 months ago
- A backup of SmokelessRuntimeEFIPatcher☆27Jun 19, 2024Updated last year
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning☆30May 18, 2025Updated 8 months ago
- A lightweight adjustment tool for smoothing token probabilities in the Qwen models to encourage balanced multilingual generation.☆104Jul 9, 2025Updated 7 months ago
- RADLADS training code☆36May 7, 2025Updated 9 months ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G…☆26Mar 28, 2025Updated 10 months ago
- ☆24Feb 9, 2025Updated last year
- RePlan: Reasoning-Guided Region Planning for Complex Instruction-Based Image Editing☆59Dec 26, 2025Updated last month
- A pipeline parallel training script for LLMs.☆166Apr 30, 2025Updated 9 months ago
- 🎯An accuracy-first, highly efficient quantization toolkit for LLMs, designed to minimize quality degradation across Weight-Only Quantiza…☆845Feb 6, 2026Updated last week
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"☆29Jun 30, 2025Updated 7 months ago
- Create text chunks which end at natural stopping points without using a tokenizer☆26Nov 26, 2025Updated 2 months ago
- (ICLR 2026) Unveiling Super Experts in Mixture-of-Experts Large Language Models☆36Sep 25, 2025Updated 4 months ago
- Moondream MCP Server in Python☆45Jul 2, 2025Updated 7 months ago
- ☆112Jun 19, 2025Updated 7 months ago
- Open source tool for transcription and subtitling, alternative to happyscribe.☆33Feb 12, 2025Updated last year
- Generates control vectors for use with llama.cpp in GGUF format.☆38Mar 19, 2025Updated 10 months ago
- Lightweight continuous batching with OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper.☆29Mar 15, 2025Updated 10 months ago
- DFloat11 [NeurIPS '25]: Lossless Compression of LLMs and DiTs for Efficient GPU Inference☆603Nov 24, 2025Updated 2 months ago
- InSales e-commerce platform API bindings☆14Jul 13, 2024Updated last year
- Multi-step AI agents powered by Gemini 2.0 and the LangGraph framework. These agents orchestrate complex workflows and enhance their reas…☆10Dec 19, 2024Updated last year
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"☆32Nov 4, 2024Updated last year
- 🚀 FlexLLama - Lightweight self-hosted tool for running multiple llama.cpp server instances with OpenAI v1 API compatibility and multi-GP…☆50Nov 26, 2025Updated 2 months ago