SakanaAI / continuous-thought-machines
Continuous Thought Machines, because thought takes time and reasoning is a process.
★1,277 Updated last month
Alternatives and similar repositories for continuous-thought-machines
Users who are interested in continuous-thought-machines are comparing it to the libraries listed below.
- A Self-adaptation Framework that adapts LLMs for unseen tasks in real-time! ★1,136 Updated 6 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ★816 Updated last month
- Self-Adapting Language Models ★770 Updated 3 weeks ago
- Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch ★1,448 Updated 2 months ago
- Code for BLT research paper ★1,966 Updated 3 months ago
- Dream 7B, a large diffusion language model ★915 Updated this week
- ★443 Updated 3 months ago
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents ★1,616 Updated 2 weeks ago
- Muon is an optimizer for hidden layers in neural networks ★1,595 Updated last month
- Training Large Language Model to Reason in a Continuous Latent Space ★1,249 Updated 2 weeks ago
- AlphaGo Moment for Model Architecture Discovery. ★1,045 Updated 3 weeks ago
- Official Repository of Absolute Zero Reasoner ★1,669 Updated this week
- Automating the Search for Artificial Life with Foundation Models! ★427 Updated 7 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ★842 Updated 2 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ★2,763 Updated this week
- PyTorch code and models for VJEPA2 self-supervised learning from video. ★2,090 Updated last week
- Build your own visual reasoning model ★405 Updated last week
- H-Net: Hierarchical Network with Dynamic Chunking ★669 Updated 3 weeks ago
- procedural reasoning datasets ★1,060 Updated last week
- Implementing DeepSeek R1's GRPO algorithm from scratch ★1,537 Updated 4 months ago
- Recipes to scale inference-time compute of open models ★1,112 Updated 3 months ago
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents ★546 Updated 2 weeks ago
- Decentralized RL Training at Scale ★472 Updated this week
- PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning ★428 Updated last month
- MMaDA - Open-Sourced Multimodal Large Diffusion Language Models ★1,323 Updated last week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ★319 Updated 10 months ago
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ★904 Updated 3 months ago
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ★560 Updated last year
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation ★420 Updated 3 weeks ago
- Open-source implementation of AlphaEvolve ★3,727 Updated this week