SakanaAI / continuous-thought-machines
Continuous Thought Machines, because thought takes time and reasoning is a process.
☆1,238 Updated 3 weeks ago
Alternatives and similar repositories for continuous-thought-machines
Users who are interested in continuous-thought-machines are comparing it to the libraries listed below.
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,132 Updated 6 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ☆808 Updated 3 weeks ago
- Code for BLT research paper ☆1,765 Updated 2 months ago
- Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch ☆1,425 Updated 2 months ago
- Self-Adapting Language Models ☆743 Updated this week
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆836 Updated last month
- Automating the Search for Artificial Life with Foundation Models! ☆427 Updated 6 months ago
- Dream 7B, a large diffusion language model ☆873 Updated last month
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents ☆1,550 Updated last month
- Official Repository of Absolute Zero Reasoner ☆1,635 Updated last week
- PyTorch code and models for VJEPA2 self-supervised learning from video. ☆1,972 Updated last month
- AlphaGo Moment for Model Architecture Discovery. ☆960 Updated this week
- ☆402 Updated 2 months ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,224 Updated 6 months ago
- Muon is an optimizer for hidden layers in neural networks ☆1,454 Updated 3 weeks ago
- Large Concept Models: Language modeling in a sentence representation space ☆2,257 Updated 6 months ago
- OpenAlpha_Evolve is an open-source Python framework inspired by the groundbreaking research on autonomous coding agents like DeepMind's A… ☆866 Updated 2 months ago
- The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search ☆1,479 Updated 3 months ago
- procedural reasoning datasets ☆1,012 Updated this week
- PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning ☆400 Updated 2 weeks ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆2,658 Updated last week
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,508 Updated 3 months ago
- ☆2,212 Updated last week
- Build your own visual reasoning model ☆401 Updated this week
- H-Net: Hierarchical Network with Dynamic Chunking ☆632 Updated last week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆318 Updated 9 months ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation ☆367 Updated this week
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents ☆538 Updated 2 weeks ago
- Atom of Thoughts for Markov LLM Test-Time Scaling ☆580 Updated last month
- Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper ☆700 Updated last month