SakanaAI / continuous-thought-machines
Continuous Thought Machines, because thought takes time and reasoning is a process.
☆1,309 Updated 2 months ago
Alternatives and similar repositories for continuous-thought-machines
Users who are interested in continuous-thought-machines are comparing it to the libraries listed below.
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,151 Updated 8 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ☆830 Updated last month
- Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch ☆1,465 Updated 4 months ago
- Code for BLT research paper ☆1,987 Updated 4 months ago
- Self-Adapting Language Models ☆800 Updated 2 months ago
- AlphaGo Moment for Model Architecture Discovery. ☆1,087 Updated 2 months ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,278 Updated last month
- Dream 7B, a large diffusion language model ☆991 Updated 2 weeks ago
- Official Repository of Absolute Zero Reasoner ☆1,699 Updated last month
- PyTorch code and models for VJEPA2 self-supervised learning from video. ☆2,269 Updated last month
- Muon is an optimizer for hidden layers in neural networks ☆1,803 Updated 2 months ago
- H-Net: Hierarchical Network with Dynamic Chunking ☆744 Updated last week
- ☆478 Updated 4 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆881 Updated 4 months ago
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents ☆1,684 Updated last month
- ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution ☆478 Updated last week
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,168 Updated this week
- PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning ☆524 Updated last week
- Automating the Search for Artificial Life with Foundation Models! ☆429 Updated 8 months ago
- ☆999 Updated this week
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,596 Updated 5 months ago
- ☆2,361 Updated last week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆342 Updated 9 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆322 Updated 11 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆2,996 Updated last week
- Async RL Training at Scale ☆669 Updated this week
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆535 Updated this week
- A minimal implementation of DeepMind's Genie world model ☆904 Updated last week
- ☆1,283 Updated 3 weeks ago
- prime is a framework for efficient, globally distributed training of AI models over the internet. ☆828 Updated 4 months ago