SakanaAI / continuous-thought-machines
Continuous Thought Machines, because thought takes time and reasoning is a process.
☆1,761 · Dec 29, 2025 · Updated last month
Alternatives and similar repositories for continuous-thought-machines
Users interested in continuous-thought-machines are comparing it to the libraries listed below.
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents ☆1,819 · Aug 13, 2025 · Updated 6 months ago
- A Self-adaptation Framework 🐙 that adapts LLMs for unseen tasks in real-time! ☆1,187 · Jan 30, 2025 · Updated last year
- Code for BLT research paper ☆2,028 · Nov 3, 2025 · Updated 3 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ☆863 · Dec 29, 2025 · Updated last month
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬 ☆12,048 · Dec 19, 2025 · Updated last month
- Open-source implementation of AlphaEvolve ☆5,369 · Feb 4, 2026 · Updated last week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… (see the memory-layer sketch after this list) ☆371 · Dec 12, 2024 · Updated last year
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,496 · Aug 12, 2025 · Updated 6 months ago
- The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search ☆2,084 · Dec 19, 2025 · Updated last month
- Large Concept Models: Language modeling in a sentence representation space ☆2,333 · Jan 29, 2025 · Updated last year
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆347 · Oct 22, 2024 · Updated last year
- Official Repository of Absolute Zero Reasoner ☆1,813 · Aug 24, 2025 · Updated 5 months ago
- ☆1,033 · Dec 17, 2024 · Updated last year
- Hierarchical Reasoning Model Official Release ☆12,299 · Sep 9, 2025 · Updated 5 months ago
- Automating the Search for Artificial Life with Foundation Models! ☆450 · Oct 23, 2025 · Updated 3 months ago
- Parallel Scaling Law for Language Models: Beyond Parameter and Inference Time Scaling ☆469 · May 17, 2025 · Updated 8 months ago
- Minimal reproduction of DeepSeek R1-Zero ☆12,715 · Apr 24, 2025 · Updated 9 months ago
- Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch ☆1,935 · Updated this week
- ☆215 · Jan 5, 2026 · Updated last month
- Official repository of Evolutionary Optimization of Model Merging Recipes ☆1,395 · Nov 29, 2024 · Updated last year
- Minimalistic large language model 3D-parallelism training ☆2,544 · Dec 11, 2025 · Updated 2 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆3,554 · Nov 12, 2025 · Updated 3 months ago
- [NeurIPS 2025] TTRL: Test-Time Reinforcement Learning ☆986 · Sep 26, 2025 · Updated 4 months ago
- CycleQD is a framework for parameter space model merging. ☆48 · Feb 1, 2025 · Updated last year
- Official repo of paper LM2 ☆46 · Feb 13, 2025 · Updated last year
- Implementation of SOAR ☆49 · Sep 17, 2025 · Updated 4 months ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025) ☆542 · Sep 26, 2025 · Updated 4 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆940 · Jun 8, 2025 · Updated 8 months ago
- LLM as World Models using Bayesian inference ☆16 · May 27, 2025 · Updated 8 months ago
- [ICLR 2025] Automated Design of Agentic Systems ☆1,513 · Jan 28, 2025 · Updated last year
- Sky-T1: Train your own O1 preview model within $450 ☆3,370 · Jul 12, 2025 · Updated 7 months ago
- Mamba SSM architecture ☆17,153 · Jan 12, 2026 · Updated last month
- Official repository of the xLSTM. ☆2,102 · Nov 4, 2025 · Updated 3 months ago
- s1: Simple test-time scaling ☆6,636 · Jun 25, 2025 · Updated 7 months ago
- Tools for merging pretrained large language models. ☆6,783 · Jan 26, 2026 · Updated 2 weeks ago
- Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion" ☆1,152 · Nov 9, 2025 · Updated 3 months ago
- Async RL Training at Scale ☆1,071 · Updated this week
- GRadient-INformed MoE ☆264 · Sep 25, 2024 · Updated last year
- RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)… (see the parallel-scan sketch below) ☆14,351 · Updated this week
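For the memory-layers entry above: the description names a trainable key-value lookup that adds parameters without adding FLOPs. Below is a minimal PyTorch sketch of that general idea, not the linked repository's implementation; the class name and the `num_slots` and `top_k` hyperparameters are illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMemoryLayer(nn.Module):
    """Toy trainable key-value memory: capacity lives in the key/value
    tables, but each token only reads its top-k slots, so the extra
    FLOPs stay small relative to the extra parameters."""

    def __init__(self, dim: int, num_slots: int = 4096, top_k: int = 8):
        super().__init__()
        self.query_proj = nn.Linear(dim, dim)
        self.keys = nn.Parameter(torch.randn(num_slots, dim) * 0.02)
        self.values = nn.Parameter(torch.randn(num_slots, dim) * 0.02)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.query_proj(x)                        # (batch, seq, dim)
        scores = q @ self.keys.t()                    # (batch, seq, num_slots)
        topv, topi = scores.topk(self.top_k, dim=-1)  # keep only k slots per token
        weights = F.softmax(topv, dim=-1)             # sparse attention over those slots
        gathered = self.values[topi]                  # (batch, seq, k, dim)
        return x + (weights.unsqueeze(-1) * gathered).sum(dim=-2)  # residual update

layer = SparseMemoryLayer(dim=64)
print(layer(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```

Production memory layers avoid the dense `q @ keys` scan (e.g. with a product-key decomposition) so the lookup stays cheap even with millions of slots; the full matmul here is kept only for readability.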
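For the RWKV entry above: the claim that an RNN can be "directly trained like a GPT transformer (parallelizable)" rests on the recurrence having a closed parallel form. The sketch below demonstrates that identity for a generic elementwise linear recurrence, not RWKV's actual WKV kernel; the decay range and function names are assumptions for the demo.

```python
import torch

def recurrence_sequential(a: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Inference mode: h_t = a_t * h_{t-1} + x_t, one step at a time."""
    h = torch.zeros_like(x[0])
    out = []
    for t in range(x.shape[0]):
        h = a[t] * h + x[t]
        out.append(h)
    return torch.stack(out)

def recurrence_parallel(a: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Training mode: the same recurrence in closed form over all timesteps,
    h_t = A_t * sum_{s<=t} x_s / A_s, where A_t = a_1 * ... * a_t."""
    A = torch.cumprod(a, dim=0)
    return A * torch.cumsum(x / A, dim=0)

T, D = 16, 4
a = torch.rand(T, D, dtype=torch.float64) * 0.45 + 0.5  # decays in (0.5, 0.95)
x = torch.randn(T, D, dtype=torch.float64)
assert torch.allclose(recurrence_sequential(a, x), recurrence_parallel(a, x))
```

Real implementations use numerically safer formulations than this prefix-product division, which underflows for long sequences; the demo runs in float64 for that reason.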