koayon / awesome-adaptive-computationLinks

A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).

☆158

Alternatives and similar repositories for awesome-adaptive-computation

Users that are interested in awesome-adaptive-computation are comparing it to the libraries listed below

Sorting:

lucidrains / coconut-pytorch
Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch
☆180Updated 5 months ago
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆240Updated last month
jxiw / MambaInLlama
[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models
☆231Updated last month
lucidrains / PEER-pytorch
Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind
☆131Updated 3 weeks ago
minyoungg / LTE
☆69Updated last year
mcleish7 / arithmetic
Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)
☆194Updated last year
FasterDecoding / BitDelta
☆203Updated 11 months ago
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆173Updated 4 months ago
HazyResearch / based
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
☆243Updated 5 months ago
CASE-Lab-UMD / LLM-Drop
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
☆180Updated last week
IBM / ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp…
☆225Updated 2 months ago
ScalingIntelligence / large_language_monkeys
☆108Updated last year
jzhang38 / LongMamba
Some preliminary explorations of Mamba's context scaling.
☆217Updated last year
lucidrains / CALM-pytorch
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind
☆177Updated last year
RobertCsordas / moeut
☆88Updated last year
itsnamgyu / block-transformer
Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024)
☆162Updated 7 months ago
JacobPfau / fillerTokens
☆75Updated last year
mengxiayu / LLMSuperWeight
Code for studying the super weight in LLM
☆120Updated 11 months ago
architsharma97 / dpo-rlaif
☆100Updated last year
dmis-lab / Monet
[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
☆73Updated 4 months ago
OSU-NLP-Group / GrokkedTransformer
Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
☆233Updated 4 months ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100Updated last year
HanGuo97 / lq-lora
☆128Updated last year
llm-random / llm-random
☆204Updated last week
JoshEngels / MultiDimensionalFeatures
Code for reproducing our paper "Not All Language Model Features Are Linear"
☆84Updated 11 months ago
locuslab / massive-activations
Code accompanying the paper "Massive Activations in Large Language Models"
☆186Updated last year
google-deepmind / asyncdiloco
☆47Updated last year
kyleliang919 / Online-Subspace-Descent
[NeurIPS 2024] Low rank memory efficient optimizer without SVD
☆30Updated 4 months ago
Luckfort / CD
[COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?
☆82Updated 10 months ago
BorealisAI / flora-opt
This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024.
☆104Updated last year