SakanaAI / self-adaptive-llms
A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!
☆1,136 Updated 6 months ago
Alternatives and similar repositories for self-adaptive-llms
Users who are interested in self-adaptive-llms are comparing it to the repositories listed below.
- Code for BLT research paper ☆1,966 Updated 3 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ☆816 Updated last month
- Continuous Thought Machines, because thought takes time and reasoning is a process. ☆1,277 Updated last month
- Self-Adapting Language Models ☆770 Updated 3 weeks ago
- Recipes to scale inference-time compute of open models ☆1,112 Updated 3 months ago
- Large Concept Models: Language modeling in a sentence representation space ☆2,267 Updated 6 months ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,249 Updated 2 weeks ago
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆904 Updated 3 months ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆872 Updated 2 weeks ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆319 Updated 10 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆842 Updated 2 months ago
- Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch ☆1,448 Updated 2 months ago
- Dream 7B, a large diffusion language model ☆915 Updated this week
- [ICLR 2025] Automated Design of Agentic Systems ☆1,402 Updated 7 months ago
- ☆1,033 Updated 8 months ago
- An Open Large Reasoning Model for Real-World Solutions ☆1,516 Updated 2 months ago
- Build your own visual reasoning model ☆405 Updated last week
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents ☆1,616 Updated 2 weeks ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆2,763 Updated this week
- Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification. ☆658 Updated 5 months ago
- prime is a framework for efficient, globally distributed training of AI models over the internet. ☆805 Updated 3 months ago
- Atom of Thoughts for Markov LLM Test-Time Scaling ☆583 Updated 2 months ago
- Synthetic data curation for post-training and structured data extraction ☆1,483 Updated 3 weeks ago
- An Open Source Toolkit For LLM Distillation ☆717 Updated last month
- OLMoE: Open Mixture-of-Experts Language Models ☆845 Updated 5 months ago
- Tool for generating high quality Synthetic datasets ☆1,139 Updated 3 weeks ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆344 Updated 8 months ago
- procedural reasoning datasets ☆1,060 Updated last week
- ☆2,294 Updated this week
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation ☆420 Updated 3 weeks ago