SakanaAI / self-adaptive-llms
A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!
☆1,187 · Jan 30, 2025 · Updated last year
Alternatives and similar repositories for self-adaptive-llms
Users who are interested in self-adaptive-llms are comparing it to the libraries listed below.
- Training Large Language Model to Reason in a Continuous Latent Space · ☆1,496 · Aug 12, 2025 · Updated 6 months ago
- Code for BLT research paper · ☆2,028 · Nov 3, 2025 · Updated 3 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. · ☆347 · Oct 22, 2024 · Updated last year
- Pretraining and inference code for a large-scale depth-recurrent language model · ☆863 · Dec 29, 2025 · Updated last month
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬 · ☆12,048 · Dec 19, 2025 · Updated last month
- Official repository of Evolutionary Optimization of Model Merging Recipes · ☆1,395 · Nov 29, 2024 · Updated last year
- Automating the Search for Artificial Life with Foundation Models! · ☆450 · Oct 23, 2025 · Updated 3 months ago
- Large Concept Models: Language modeling in a sentence representation space · ☆2,333 · Jan 29, 2025 · Updated last year
- Continuous Thought Machines, because thought takes time and reasoning is a process. · ☆1,761 · Dec 29, 2025 · Updated last month
- Tools for merging pretrained large language models. · ☆6,783 · Jan 26, 2026 · Updated 2 weeks ago
- Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch · ☆1,935 · Updated this week
- Minimal reproduction of DeepSeek R1-Zero · ☆12,715 · Apr 24, 2025 · Updated 9 months ago
- Minimalistic large language model 3D-parallelism training · ☆2,544 · Dec 11, 2025 · Updated 2 months ago
- Sky-T1: Train your own O1 preview model within $450 · ☆3,370 · Jul 12, 2025 · Updated 7 months ago
- ☆229 · Feb 24, 2025 · Updated 11 months ago
- Fully open reproduction of DeepSeek-R1 · ☆25,866 · Nov 24, 2025 · Updated 2 months ago
- Everything about the SmolLM and SmolVLM family of models · ☆3,602 · Jan 13, 2026 · Updated last month
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… · ☆371 · Dec 12, 2024 · Updated last year
- s1: Simple test-time scaling · ☆6,636 · Jun 25, 2025 · Updated 7 months ago
- AllenAI's post-training codebase · ☆3,573 · Updated this week
- Democratizing Reinforcement Learning for LLMs · ☆5,081 · Feb 7, 2026 · Updated last week
- Optimizing inference proxy for LLMs · ☆3,324 · Jan 28, 2026 · Updated 2 weeks ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) · ☆163 · Apr 13, 2025 · Updated 10 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… · ☆3,084 · Jan 26, 2026 · Updated 2 weeks ago
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention · ☆3,330 · Jul 7, 2025 · Updated 7 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. · ☆201 · Jul 17, 2024 · Updated last year
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models · ☆192 · Jun 13, 2024 · Updated last year
- Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your resea… · ☆5,275 · Aug 20, 2025 · Updated 5 months ago
- GRadient-INformed MoE · ☆264 · Sep 25, 2024 · Updated last year
- PyTorch native post-training library · ☆5,669 · Updated this week
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs. · ☆4,754 · Jul 18, 2025 · Updated 6 months ago
- DSPy: The framework for programming—not prompting—language models · ☆32,156 · Updated this week
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input · ☆940 · Jun 8, 2025 · Updated 8 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" · ☆3,554 · Nov 12, 2025 · Updated 3 months ago
- 🤗 smolagents: a barebones library for agents that think in code. · ☆25,422 · Jan 23, 2026 · Updated 3 weeks ago
- An Open Large Reasoning Model for Real-World Solutions · ☆1,533 · Feb 3, 2026 · Updated last week
- Framework for enhancing LLMs for RAG tasks using fine-tuning. · ☆765 · Dec 16, 2025 · Updated last month
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 · ☆357 · Feb 5, 2026 · Updated last week
- Simple & Scalable Pretraining for Neural Architecture Research · ☆308 · Dec 6, 2025 · Updated 2 months ago