mcleish7/retrofitting-recurrence

Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence

☆68

Alternatives and similar repositories for retrofitting-recurrence

Users who are interested in retrofitting-recurrence are comparing it to the repositories listed below.

morse-benchmark / morse-500
☆31 · May 21, 2026 · Updated 2 months ago
facebookresearch / scalable-curvature
Code for Dayal Kalra's research internship on scalable curvature measures for neural networks.
☆29 · Feb 3, 2026 · Updated 5 months ago
montehoover / DynaGuard
Code for "DynaGuard: A Dynamic Guardrail Model With User-Defined Policies."
☆23 · Nov 3, 2025 · Updated 8 months ago
mcleish7 / gemstone-scaling-laws
Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025)
☆35 · Sep 28, 2025 · Updated 9 months ago
facebookresearch / rl-injector
Official release of code for the paper "RL is a hammer and LLMs are nails: A simple RL approach to stronger prompt injection attacks"
☆53 · May 6, 2026 · Updated 2 months ago
devnkong / GOAT
Official implementation of GOAT model (ICML2023)
☆38 · Jul 3, 2023 · Updated 3 years ago
hamidkazemi22 / CLIPInversion
What do we learn from inverting CLIP models?
☆58 · Mar 6, 2024 · Updated 2 years ago
ahans30 / goldfish-loss
[NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs
☆98 · Nov 17, 2024 · Updated last year
facebookresearch / zero
PyTorch Implementation of Zero-Shot Vision Encoder Grafting via LLM Surrogates [ICCV'25]
☆54 · Jul 10, 2025 · Updated last year
thu-nics / TaH
[ICML'26] Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models"
☆75 · Jul 17, 2026 · Updated last week
mcleish7 / arithmetic
Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)
☆200 · May 28, 2024 · Updated 2 years ago
YuxinWenRick / diffusion_memorization
Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024)
☆80 · Apr 3, 2024 · Updated 2 years ago
j-alex-hanson / gaussian-splatting-pup
☆145 · Nov 22, 2025 · Updated 8 months ago
sandyresearch / parcae
Stable Looped Models and their Scaling Laws
☆171 · May 17, 2026 · Updated 2 months ago
belindal / state-tracking
Code and data for paper "(How) do Language Models Track State?"
☆26 · Mar 31, 2025 · Updated last year
seal-rg / recurrent-pretraining
Pretraining and inference code for a large-scale depth-recurrent language model
☆901 · Dec 29, 2025 · Updated 6 months ago
juzhengz / logit-fusion
Learning from Mixed Rollouts: Logit Fusion as a Bridge Between Imitation and Exploration
☆17 · Feb 24, 2026 · Updated 5 months ago
allenai / signal-and-noise
Measuring the Signal to Noise Ratio in Language Model Evaluation
☆31 · Aug 19, 2025 · Updated 11 months ago
neelsjain / baseline-defenses
Official Code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models"
☆34 · Oct 26, 2023 · Updated 2 years ago
JonasGeiping / dataaugs
☆18 · Oct 12, 2022 · Updated 3 years ago
JasonForJoy / BRIEF
ACL 2026 & NAACL 2025: Bridging Retrieval and Inference through Evidence Fusion
☆14 · Apr 9, 2026 · Updated 3 months ago
GBATZOLIS / BitstreamDiffusion
☆15 · Updated this week
tuallen / speede3dgs
☆109 · Jun 8, 2026 · Updated last month
neelsjain / BYOD
The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"
☆108 · Sep 23, 2023 · Updated 2 years ago
LUMIA-Group / PonderingLM
Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space"
☆26 · Jul 21, 2025 · Updated last year
wenquanlu / huginn-latent-cot
[COLM 2025: 1st Workshop on the Application of LLM Explainability to Reasoning and Planning] Latent Chain-of-Thought? Decoding the Depth-…
☆20 · Oct 4, 2025 · Updated 9 months ago
dame-cell / Triformer
Transformers components but in Triton
☆34 · May 9, 2025 · Updated last year
facebookresearch / prompt-siren
A research workbench for developing and testing attacks against large language models, with a focus on prompt injection vulnerabilities a…
☆55 · Jul 16, 2026 · Updated last week
YuxinWenRick / canary-in-a-coalmine
☆33 · Nov 27, 2023 · Updated 2 years ago
facebookresearch / PhysicsLM4
Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality
☆356 · May 20, 2026 · Updated 2 months ago
FFTYYY / mhc-lite
mHC-lite: You Don’t Need 20 Sinkhorn-Knopp Iterations
☆91 · Jan 12, 2026 · Updated 6 months ago
ElvishElvis / LCA-on-the-line
LCA-on-the-line (ICML 2024 Oral)
☆14 · Feb 13, 2025 · Updated last year
Ber666 / reasoning-by-superposition
Official implementation of "Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought" (NeurIPS 2025)
☆44 · Oct 8, 2025 · Updated 9 months ago
Leiay / looped_transformer
☆51 · Dec 12, 2023 · Updated 2 years ago
azshue / AutoPoison
The official repository of the paper "On the Exploitability of Instruction Tuning".
☆70 · Feb 5, 2024 · Updated 2 years ago
SourceShift / book-trm
Building Tiny Recursive Models from Scratch
☆15 · Oct 9, 2025 · Updated 9 months ago
hpcgroup / loki
Algorithms for approximate attention in LLMs
☆22 · Apr 14, 2025 · Updated last year
Yifei-Zuo / Parallax
Official repository for Parallax (Parameterized Local Linear Attention)
☆65 · Jul 7, 2026 · Updated 2 weeks ago
LeonLixyz / LCLM
latent context language models
☆72 · Jun 9, 2026 · Updated last month