A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
☆163Jan 1, 2025Updated last year
Alternatives and similar repositories for awesome-adaptive-computation
Users who are interested in awesome-adaptive-computation are comparing it to the libraries listed below.
- Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model☆13Feb 11, 2025Updated last year
- Token-level adaptation of LoRA matrices for downstream task generalization.☆15Apr 14, 2024Updated 2 years ago
- ☆22Aug 27, 2023Updated 2 years ago
- Aioli: A unified optimization framework for language model data mixing☆32Jan 17, 2025Updated last year
- A collection of AWESOME things about mixture-of-experts☆1,280Dec 8, 2024Updated last year
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)☆166Apr 13, 2025Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Jul 12, 2023Updated 2 years ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆19Mar 10, 2025Updated last year
- [NeurIPS 2024] Efficiency for Free: Ideal Data Are Transportable Representations☆19Jan 19, 2025Updated last year
- A library for squeakily cleaning and filtering language datasets.☆50Jul 10, 2023Updated 2 years ago
- A curated list for Efficient Large Language Models☆11Mar 25, 2024Updated 2 years ago
- Mamba R1 represents a novel architecture that combines the efficiency of Mamba's state space models with the scalability of Mixture of Ex…☆24Oct 13, 2025Updated 8 months ago
- [ACL 2023 Findings] Emergent Modularity in Pre-trained Transformers☆26Jun 7, 2023Updated 3 years ago
- A curated list of early exiting (LLM, CV, NLP, etc)☆74Aug 21, 2024Updated last year
- AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers☆48Oct 21, 2022Updated 3 years ago
- ☆27Apr 1, 2026Updated 2 months ago
- GPU operators for sparse tensor operations☆37Mar 11, 2024Updated 2 years ago
- Official Code for AdvRush: Searching for Adversarially Robust Neural Architectures (ICCV '21)☆12Dec 27, 2021Updated 4 years ago
- Tutel MoE: Optimized Mixture-of-Experts Library, Support GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4☆993Jun 4, 2026Updated 2 weeks ago
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models☆1,687Mar 8, 2024Updated 2 years ago
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes.☆83Sep 10, 2023Updated 2 years ago
- ☆84Mar 11, 2025Updated last year
- ☆21Oct 15, 2022Updated 3 years ago
- A bibliography and survey of the papers surrounding o1☆1,213Nov 16, 2024Updated last year
- batched loras☆351Sep 6, 2023Updated 2 years ago
- A curated list of Model Merging methods.☆95Dec 3, 2025Updated 6 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆121May 19, 2026Updated last month
- The code of "M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis"☆14Mar 31, 2025Updated last year
- Mixture of Experts (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations.☆12Feb 11, 2024Updated 2 years ago
- CRAI is a multimodal large language model based on the Mixture of Experts (MoE) architecture, supporting text and image cross-modal tasks…☆16Apr 29, 2025Updated last year
- A curated reading list of research in Mixture-of-Experts (MoE).☆667Oct 30, 2024Updated last year
- MoE-Visualizer is a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models.☆16Apr 8, 2025Updated last year
- Understand and test language model architectures on synthetic tasks.☆275Mar 22, 2026Updated 2 months ago
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆175Jun 20, 2024Updated last year
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp…☆21Mar 7, 2024Updated 2 years ago
- ☆415Nov 2, 2023Updated 2 years ago
- ☆20May 28, 2025Updated last year
- Awesome Mixture of Experts (MoE): A Curated List of Mixture of Experts (MoE) and Mixture of Multimodal Experts (MoME)☆65Oct 6, 2025Updated 8 months ago
- Implementation of Image Classification using Visual Transformers in Amazon SageMaker based on the ideas from research paper - Visual Tran…☆18Dec 28, 2020Updated 5 years ago