SakanaAI / continuous-thought-machines
Continuous Thought Machines, because thought takes time and reasoning is a process.
☆1,720 · Updated last month
Alternatives and similar repositories for continuous-thought-machines
Users who are interested in continuous-thought-machines compare it to the repositories listed below.
- A Self-adaptation Framework that adapts LLMs for unseen tasks in real-time! ☆1,182 · Updated last year
- Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch ☆1,905 · Updated 3 weeks ago
- AlphaGo Moment for Model Architecture Discovery. ☆1,130 · Updated last month
- Code for BLT research paper ☆2,026 · Updated 2 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ☆859 · Updated last month
- H-Net: Hierarchical Network with Dynamic Chunking ☆807 · Updated 2 months ago
- Dream 7B, a large diffusion language model ☆1,150 · Updated 2 months ago
- ☆615 · Updated 8 months ago
- Self-Adapting Language Models ☆1,684 · Updated 5 months ago
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents ☆1,802 · Updated 5 months ago
- Official Repository of Absolute Zero Reasoner ☆1,800 · Updated 5 months ago
- Automating the Search for Artificial Life with Foundation Models! ☆449 · Updated 3 months ago
- ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution ☆812 · Updated this week
- Muon is an optimizer for hidden layers in neural networks ☆2,231 · Updated last week
- dLLM: Simple Diffusion Language Modeling ☆1,633 · Updated 3 weeks ago
- PyTorch code and models for VJEPA2 self-supervised learning from video. ☆2,853 · Updated 5 months ago
- PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning ☆584 · Updated 2 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆938 · Updated 7 months ago
- A Reproduction of GDM's Nested Learning Paper ☆603 · Updated 2 weeks ago
- ☆6,283 · Updated last month
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,478 · Updated 5 months ago
- A minimal implementation of DeepMind's Genie world model ☆1,118 · Updated 2 months ago
- ☆2,553 · Updated 2 weeks ago
- NanoGPT (124M) in 2 minutes ☆4,410 · Updated last week
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,754 · Updated 9 months ago
- Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper ☆793 · Updated 5 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆345 · Updated last year
- Post-training with Tinker ☆2,770 · Updated this week
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025) ☆538 · Updated 4 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆3,522 · Updated 2 months ago