A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
☆159 · updated Jan 1, 2025
Alternatives and similar repositories for awesome-adaptive-computation
- ☆22 · updated Aug 27, 2023
- Token-level adaptation of LoRA matrices for downstream task generalization. (☆15 · updated Apr 14, 2024)
- A library for squeakily cleaning and filtering language datasets. (☆50 · updated Jul 10, 2023)
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024). (☆163 · updated Apr 13, 2025)
- Aioli: A unified optimization framework for language model data mixing. (☆32 · updated Jan 17, 2025)
- AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers. (☆48 · updated Oct 21, 2022)
- Artifact of "NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers" (ASPLOS '23): https://nnsmith-asplos.rtfd.io (☆11 · updated Mar 29, 2023)
- LLMs as world models using Bayesian inference. (☆16 · updated May 27, 2025)
- A curated list of projects and resources using BAML. (☆17 · updated Aug 1, 2025)
- A curated list for Efficient Large Language Models. (☆11 · updated Mar 25, 2024)
- A library for simplifying training with multi-GPU setups in the HuggingFace / PyTorch ecosystem. (☆16 · updated Jan 9, 2026)
- Mixture-of-Experts (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations. (☆12 · updated Feb 11, 2024)
- Implementation of the Pairformer model used in AlphaFold 3. (☆14 · updated Mar 2, 2026)
- A collection of AWESOME things about mixture-of-experts. (☆1,269 · updated Dec 8, 2024)
- [ACL 2023 Findings] Emergent Modularity in Pre-trained Transformers. (☆26 · updated Jun 7, 2023)
- Script for processing OpenAI's PRM800K process-supervision dataset into an Alpaca-style instruction-response format. (☆27 · updated Jul 12, 2023)
- EfficientSAM + YOLO World base model for use with Autodistill. (☆10 · updated Feb 21, 2024)
- Implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models". (☆115 · updated Mar 3, 2026)
- Official code for AdvRush: Searching for Adversarially Robust Neural Architectures (ICCV '21). (☆12 · updated Dec 27, 2021)
- ☆29 · updated Jan 23, 2024
- Batched LoRAs. (☆350 · updated Sep 6, 2023)
- Tutel MoE: an optimized Mixture-of-Experts library supporting GptOss/DeepSeek/Kimi-K2/Qwen3 with FP8/NVFP4/MXFP4. (☆973 · updated this week)
- ☆274 · updated Oct 31, 2023
- A plugin that uses a language model to fill in parts of notes. (☆16 · updated Feb 20, 2024)
- [NeurIPS 2025 Spotlight] Implementation of "KLASS: KL-Guided Fast Inference in Masked Diffusion Models". (☆23 · updated Jan 3, 2026)
- ☆95 · updated Jul 26, 2023
- Code for the paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork". (☆33 · updated Nov 29, 2023)
- GPU operators for sparse tensor operations. (☆35 · updated Mar 11, 2024)
- ☆415 · updated Nov 2, 2023
- A family of open-source Mixture-of-Experts (MoE) large language models. (☆1,664 · updated Mar 8, 2024)
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts". (☆19 · updated Mar 10, 2025)
- Official PyTorch implementation of the ICCV 2023 paper "When Prompt-based Incremental Learning Does Not Meet Strong Pretraining". (☆16 · updated Jan 8, 2024)
- Examples for running TeNPy. (☆16 · updated Oct 31, 2025)
- Friends of OLMo and their links. (☆358 · updated Sep 15, 2025)
- 2D road segmentation using lidar data during training. (☆43 · updated Dec 21, 2023)
- 🏥 Health monitor for a Petals swarm. (☆40 · updated Jul 24, 2024)
- 🔥 LLM-powered GPU kernel synthesis: train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation… (☆129 · updated Nov 10, 2025)
- Training hybrid models for dummies. (☆29 · updated Nov 1, 2025)
- A bibliography and survey of the papers surrounding o1. (☆1,212 · updated Nov 16, 2024)
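Several of the repositories above (Tutel, OpenMoE, HyperRouter, AutoMoE) implement sparse Mixture-of-Experts layers, whose common building block is top-k gating: each token's router logits are reduced to softmax weights over only its k highest-scoring experts. A minimal, framework-free sketch of that idea (illustrative only; the function name and scalar loop are assumptions, not code from any listed repository):

```python
import math

def top_k_gate(logits, k=2):
    """Softmax over the top-k expert logits; all other experts get weight 0.

    Illustrative scalar version of sparse MoE gating; real implementations
    batch this over tokens on the accelerator.
    """
    # Pick the k highest-scoring experts for this token.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to the selected experts (max-subtracted for stability).
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exps.values())
    return [exps.get(i, 0.0) / z for i in range(len(logits))]

# With k=2, only the two strongest experts (indices 1 and 3 here) receive
# nonzero weight, and their weights sum to 1.
weights = top_k_gate([0.1, 2.0, -1.0, 1.5], k=2)
```

The token's output is then the weighted sum of the selected experts' outputs, so compute per token scales with k rather than with the total expert count.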