SakanaAI / self-adaptive-llms
A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!
☆1,132, updated 6 months ago
Alternatives and similar repositories for self-adaptive-llms
Users who are interested in self-adaptive-llms are comparing it to the libraries listed below
- Pretraining and inference code for a large-scale depth-recurrent language model ☆808, updated 2 weeks ago
- Code for BLT research paper ☆1,760, updated 2 months ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,224, updated 6 months ago
- Continuous Thought Machines, because thought takes time and reasoning is a process. ☆1,223, updated 3 weeks ago
- AlphaGo Moment for Model Architecture Discovery. ☆794, updated last week
- Self-Adapting Language Models ☆743, updated this week
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆901, updated 3 months ago
- Large Concept Models: Language modeling in a sentence representation space ☆2,254, updated 6 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆836, updated last month
- Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch ☆1,425, updated 2 months ago
- Recipes to scale inference-time compute of open models ☆1,110, updated 2 months ago
- Dream 7B, a large diffusion language model ☆873, updated last month
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents ☆1,550, updated last month
- [ICLR 2025] Automated Design of Agentic Systems ☆1,395, updated 6 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆318, updated 9 months ago
- ☆1,028, updated 7 months ago
- Verifiers for LLM Reinforcement Learning ☆1,690, updated this week
- Atom of Thoughts for Markov LLM Test-Time Scaling ☆580, updated last month
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆823, updated last month
- Official PyTorch implementation for "Large Language Diffusion Models" ☆2,658, updated last week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆344, updated 7 months ago
- Synthetic data curation for post-training and structured data extraction ☆1,468, updated last week
- System 2 Reasoning Link Collection ☆849, updated 4 months ago
- An Open Source Toolkit For LLM Distillation ☆698, updated 3 weeks ago
- Build your own visual reasoning model ☆401, updated this week
- ☆608, updated 3 weeks ago
- Releases from OpenAI Preparedness ☆815, updated this week
- OLMoE: Open Mixture-of-Experts Language Models ☆830, updated 4 months ago
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆573, updated this week
- procedural reasoning datasets ☆1,012, updated this week