Leiay/looped_transformer

☆52

Alternatives and similar repositories for looped_transformer

Users who are interested in looped_transformer are comparing it to the repositories listed below.

chijames / KERPLE
☆20 · Oct 25, 2022 · Updated 3 years ago
HanseulJo / position-coupling
Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure (NeurIPS 2024) + Arithmetic Transfor…
☆14 · Oct 26, 2025 · Updated 9 months ago
watcl-lab / positional_attention
Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation"
☆14 · May 26, 2025 · Updated last year
locuslab / get
Generative Equilibrium Transformer
☆28 · Nov 11, 2023 · Updated 2 years ago
nikhilvyas / SOAP_MUON
Combining SOAP and MUON
☆25 · Feb 11, 2025 · Updated last year
mcleish7 / retrofitting-recurrence
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
☆68 · Nov 11, 2025 · Updated 8 months ago
sanyalsunny111 / Looped-GPT
Minimal and highly hackable implementation of Looped Transformers with GPT
☆25 · Mar 8, 2026 · Updated 4 months ago
Zcchill / Value-Residual-Learning
☆15 · Mar 20, 2025 · Updated last year
FranxYao / Complexity-Based-Prompting
Complexity Based Prompting for Multi-Step Reasoning
☆17 · Mar 10, 2023 · Updated 3 years ago
XuezheMax / gecko-llm
Gecko Architecture
☆16 · Jan 13, 2026 · Updated 6 months ago
smonsays / contrastive-meta-learning
Code accompanying the paper "A contrastive rule for meta-learning"
☆13 · Oct 31, 2024 · Updated last year
gnobitab / ConstrainedDiffusionBridge
☆14 · Mar 16, 2023 · Updated 3 years ago
TrentBrick / SDMContinualLearner
☆21 · Mar 1, 2023 · Updated 3 years ago
corl-team / rebased
Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models"
☆169 · Jan 16, 2025 · Updated last year
hamzakeurti / homomorphismvae
☆13 · Sep 18, 2024 · Updated last year
RobertCsordas / moeut
☆93 · Aug 18, 2024 · Updated last year
shawntan / stickbreaking-attention
Stick-breaking attention
☆63 · Jul 1, 2025 · Updated last year
eamartin / parallelizing_linear_rnns
☆45 · Apr 30, 2018 · Updated 8 years ago
maximzubkov / fft-scan
Efficient PScan implementation in PyTorch
☆17 · Jan 2, 2024 · Updated 2 years ago
gd-zhang / Follow-the-Ridge
Minimax Optimization, Stackelberg Games, Generative Adversarial Networks
☆19 · Feb 14, 2020 · Updated 6 years ago
EkdeepSLubana / BeyondBatchNorm
Codebase for the paper "Beyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning"
☆17 · Jul 12, 2021 · Updated 5 years ago
iankur / vqllm
Residual vector quantization for KV cache compression in large language model
☆12 · Oct 22, 2024 · Updated last year
effl-lab / Fast-Neural-Fields
Research Papers on Efficient Neural Fields from EffL Group
☆16 · Apr 21, 2025 · Updated last year
shreyansh26 / An-Empirical-Model-of-Large-Batch-Training
An approximate implementation of the OpenAI paper - An Empirical Model of Large-Batch Training for MNIST
☆11 · Nov 19, 2022 · Updated 3 years ago
nouhadziri / faith-and-fate
☆39 · Mar 29, 2024 · Updated 2 years ago
aviclu / ffn-values
☆67 · May 18, 2023 · Updated 3 years ago
AntiBargu / RUC-YOJ
Renmin University of China YOJ problem set
☆12 · Jun 9, 2022 · Updated 4 years ago
JinjieNi / Quokka
The official github repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale diffusion language models sca…
☆46 · Nov 6, 2025 · Updated 8 months ago
microsoft / DGT
Learning Accurate Decision Trees with Bandit Feedback via Quantized Gradient Descent
☆16 · Sep 8, 2022 · Updated 3 years ago
kazuki-irie / kv-memory-brain
Official Code Repository for the paper "Key-value memory in the brain"
☆32 · Feb 25, 2025 · Updated last year
google-deepmind / neural_networks_solomonoff_induction
Learning Universal Predictors
☆84 · Aug 1, 2024 · Updated last year
rpatrik96 / ima-vae
This is the code for the paper Embrace the Gap: VAEs perform Independent Mechanism Analysis, showing that optimizing the ELBO is equivale…
☆22 · Apr 22, 2024 · Updated 2 years ago
gravesee / rulefit
Fit Lasso model to binary rules created from tree ensembles
☆12 · Aug 2, 2017 · Updated 8 years ago
lucidrains / hyper-connections
Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public
☆187 · May 13, 2026 · Updated 2 months ago
LCS2-IIITD / DaSLaM
☆17 · Oct 31, 2023 · Updated 2 years ago
luli-git / MAP
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
☆18 · Sep 2, 2024 · Updated last year
med-air / AI-Endo
Code repository of AI-Endo
☆16 · Jan 16, 2024 · Updated 2 years ago
Keely-Ai / F2D2
Joint Distillation for Fast Likelihood Evaluation and Sampling in Flow-based Models
☆22 · Mar 5, 2026 · Updated 4 months ago
ejmichaud / grokking-squared
☆28 · Feb 1, 2023 · Updated 3 years ago