SakanaAI / continuous-thought-machines
Continuous Thought Machines, because thought takes time and reasoning is a process.
☆492 · Updated this week
Alternatives and similar repositories for continuous-thought-machines
Users interested in continuous-thought-machines are comparing it to the libraries listed below.
- Pretraining code for a large-scale depth-recurrent language model ☆760 · Updated last month
- Build your own visual reasoning model ☆362 · Updated this week
- Dream 7B, a large diffusion language model ☆630 · Updated 2 weeks ago
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,058 · Updated 3 months ago
- Unofficial PyTorch implementation of Titans, a SOTA memory mechanism for transformers ☆1,323 · Updated last month
- Muon optimizer: >30% better sample efficiency with <3% wall-clock overhead (a minimal sketch of the update follows this list) ☆623 · Updated last month
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… (a toy sketch follows this list) ☆327 · Updated 5 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆450 · Updated this week
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos" ☆154 · Updated 2 months ago
- Training Large Language Model to Reason in a Continuous Latent Space (a toy sketch of the latent-thought loop follows this list) ☆1,109 · Updated 3 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally applicable memory systems for transformers. ☆307 · Updated 6 months ago
- Automating the Search for Artificial Life with Foundation Models! ☆410 · Updated 4 months ago
- ☆177 · Updated 5 months ago
- Code for the BLT (Byte Latent Transformer) research paper ☆1,587 · Updated this week
- [ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters ☆555 · Updated 3 months ago
- Procedural reasoning datasets ☆580 · Updated this week
- Understanding R1-Zero-Like Training: A Critical Perspective ☆925 · Updated last month
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆307 · Updated 5 months ago
- Muon is Scalable for LLM Training ☆1,044 · Updated last month
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models ☆595 · Updated last month
- Getting crystal-like representations with harmonic loss ☆183 · Updated last month
- Implementing DeepSeek R1's GRPO algorithm from scratch (a sketch of the group-relative advantage follows this list) ☆1,328 · Updated 3 weeks ago
- noise_step: Training in 1.58b With No Gradient Memory ☆219 · Updated 4 months ago
- Atom of Thoughts for Markov LLM Test-Time Scaling ☆563 · Updated this week
- prime is a framework for efficient, globally distributed training of AI models over the internet. ☆743 · Updated last week
- Official PyTorch implementation for "Large Language Diffusion Models" ☆1,592 · Updated last week
- Recipes to scale inference-time compute of open models ☆1,071 · Updated last week
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆871 · Updated 2 weeks ago
- System 2 Reasoning Link Collection ☆833 · Updated 2 months ago
- prime-rl is a codebase for decentralized RL training at scale ☆211 · Updated this week
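Several entries above name concrete techniques; the short sketches below illustrate their core ideas. They are toy versions written for this page under stated assumptions, not code from the linked repositories.

Muon's reported efficiency comes from orthogonalizing the momentum-smoothed update of each 2D weight matrix before applying it. A minimal sketch, assuming plain (non-Nesterov) momentum and using an SVD for clarity where the real optimizer uses a cheaper Newton-Schulz approximation; `muon_step` and its defaults are invented for illustration:

```python
import torch

def muon_step(param: torch.Tensor, grad: torch.Tensor, buf: torch.Tensor,
              lr: float = 0.02, momentum: float = 0.95) -> None:
    """One Muon-style update for a 2D weight matrix (toy version)."""
    buf.mul_(momentum).add_(grad)  # momentum accumulation
    # Replace the update with its nearest (semi-)orthogonal matrix. The real
    # optimizer approximates this with a few Newton-Schulz iterations on GPU.
    u, _, vT = torch.linalg.svd(buf, full_matrices=False)
    param.add_(u @ vT, alpha=-lr)

# Toy usage: one step on a random 4x3 weight matrix.
w, g = torch.randn(4, 3), torch.randn(4, 3)
buf = torch.zeros_like(w)
muon_step(w, g, buf)
```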
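The memory-layers entry describes a trainable key-value table that grows a model's parameter count without growing its per-token compute much. A toy sketch that scores every slot and reads out the top-k; real implementations use product keys so even the scoring stays cheap, and `num_slots`/`k` here are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseKeyValueMemory(nn.Module):
    """Toy memory layer: a large trainable key-value table with top-k readout."""
    def __init__(self, dim: int, num_slots: int = 4096, k: int = 8):
        super().__init__()
        self.query_proj = nn.Linear(dim, dim)
        self.keys = nn.Parameter(torch.randn(num_slots, dim) * 0.02)
        self.values = nn.Parameter(torch.randn(num_slots, dim) * 0.02)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.query_proj(x)                   # (batch, seq, dim)
        scores = q @ self.keys.t()               # (batch, seq, num_slots)
        top_scores, top_idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(top_scores, dim=-1)  # (batch, seq, k)
        gathered = self.values[top_idx]          # (batch, seq, k, dim)
        out = (weights.unsqueeze(-1) * gathered).sum(dim=-2)
        return x + out                           # residual connection

layer = SparseKeyValueMemory(dim=64)
print(layer(torch.randn(2, 16, 64)).shape)  # torch.Size([2, 16, 64])
```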
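The continuous-latent-space entry (the Coconut line of work) replaces some chain-of-thought tokens with continuous "thoughts": the model's last hidden state is fed back as the next input embedding instead of being decoded into a token. A toy sketch with a GRU cell standing in for the transformer; the class and its parameters are invented for illustration:

```python
import torch
import torch.nn as nn

class TinyLatentReasoner(nn.Module):
    """Toy model that 'thinks' in hidden-state space before decoding."""
    def __init__(self, vocab_size: int = 100, dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.cell = nn.GRUCell(dim, dim)  # stand-in for a transformer block
        self.lm_head = nn.Linear(dim, vocab_size)

    def forward(self, tokens: torch.Tensor, num_thoughts: int = 3):
        h = torch.zeros(tokens.size(0), self.cell.hidden_size)
        for t in range(tokens.size(1)):  # consume the prompt tokens
            h = self.cell(self.embed(tokens[:, t]), h)
        for _ in range(num_thoughts):    # latent thoughts: the hidden state
            h = self.cell(h, h)          # is reused directly as the input
        return self.lm_head(h)           # decode only after thinking

model = TinyLatentReasoner()
print(model(torch.randint(0, 100, (2, 5))).shape)  # torch.Size([2, 100])
```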
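GRPO, as described in the DeepSeek papers, drops PPO's learned value baseline: several completions are sampled per prompt, and each reward is normalized against its own group to form the advantage. A minimal sketch of that advantage computation (the clipped policy-gradient objective applied on top of it is omitted):

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """Group-relative advantages from a (num_prompts, group_size) reward matrix.

    Each row holds scalar rewards for the completions sampled from one prompt;
    normalizing within the row gives a baseline with no value network."""
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 completions each; correct answers get reward 1.
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 0.0, 1.0]])
print(grpo_advantages(rewards))
```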