SakanaAI / self-adaptive-llms
A Self-adaptation Framework that adapts LLMs for unseen tasks in real-time!
☆831 · Updated 2 weeks ago
Alternatives and similar repositories for self-adaptive-llms:
Users who are interested in self-adaptive-llms are comparing it to the libraries listed below:
- Training Large Language Model to Reason in a Continuous Latent Space · ☆746 · Updated this week
- Code for BLT research paper · ☆1,353 · Updated this week
- Distributed Training Over-The-Internet · ☆866 · Updated last month
- Large Concept Models: Language modeling in a sentence representation space · ☆1,806 · Updated this week
- Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch · ☆911 · Updated this week
- Recipes to scale inference-time compute of open models · ☆975 · Updated last week
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" · ☆833 · Updated last week
- ☆997 · Updated last month
- System 2 Reasoning Link Collection · ☆751 · Updated this week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. · ☆282 · Updated 3 months ago
- prime is a framework for efficient, globally distributed training of AI models over the internet. · ☆626 · Updated this week
- [ICLR 2025] Automated Design of Agentic Systems · ☆1,148 · Updated this week
- OpenResearcher, an advanced Scientific Research Assistant · ☆420 · Updated 3 months ago
- Everything about the SmolLM2 and SmolVLM family of models · ☆1,632 · Updated this week
- Synthetic Data curation for post-training and structured data extraction · ☆575 · Updated this week
- Sky-T1: Train your own O1 preview model within $450 · ☆2,214 · Updated this week
- Search-o1: Agentic Search-Enhanced Large Reasoning Models · ☆515 · Updated this week
- Minimalistic large language model 3D-parallelism training · ☆1,400 · Updated this week
- GRadient-INformed MoE · ☆261 · Updated 4 months ago
- An Open Source Toolkit For LLM Distillation · ☆442 · Updated 3 weeks ago
- Optimizing inference proxy for LLMs · ☆1,955 · Updated this week
- OLMoE: Open Mixture-of-Experts Language Models · ☆536 · Updated last month
- An Open Large Reasoning Model for Real-World Solutions · ☆1,411 · Updated 2 months ago
- nanoGPT style version of Llama 3.1 · ☆1,300 · Updated 5 months ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering · ☆601 · Updated 2 weeks ago
- AIDE: the state-of-the-art machine learning engineer agent, generating machine learning solution code from natural language descriptions. · ☆715 · Updated this week
- This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs. · ☆599 · Updated last month
- Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas" · ☆991 · Updated 4 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… · ☆288 · Updated last month