SakanaAI / text-to-lora
Hypernetworks that adapt LLMs to specific benchmark tasks using only a textual task description as input
☆859 · Updated 3 months ago
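For context, the core idea is a hypernetwork that consumes a task-description embedding and emits low-rank (LoRA) adapter weights for a frozen base model. The sketch below is illustrative only and is not the SakanaAI implementation; the class name `TaskToLoRAHypernet`, the head layout, and all dimensions are assumptions.

```python
# Illustrative sketch only: a tiny hypernetwork that maps a task-description
# embedding to LoRA A/B factors for one target linear layer. Not the
# text-to-lora codebase; names and shapes are hypothetical.
import torch
import torch.nn as nn


class TaskToLoRAHypernet(nn.Module):  # hypothetical name
    def __init__(self, task_emb_dim: int, hidden_dim: int,
                 target_in: int, target_out: int, rank: int = 8):
        super().__init__()
        self.rank = rank
        self.target_in, self.target_out = target_in, target_out
        # Shared trunk over the task embedding.
        self.trunk = nn.Sequential(nn.Linear(task_emb_dim, hidden_dim), nn.GELU())
        # Separate heads predict the flattened low-rank factors A and B.
        self.head_a = nn.Linear(hidden_dim, rank * target_in)
        self.head_b = nn.Linear(hidden_dim, target_out * rank)

    def forward(self, task_emb: torch.Tensor):
        h = self.trunk(task_emb)
        lora_a = self.head_a(h).view(-1, self.rank, self.target_in)
        lora_b = self.head_b(h).view(-1, self.target_out, self.rank)
        return lora_a, lora_b  # delta_W ≈ B @ A, added to the frozen weight


# Usage: embed a task description with any text encoder, then generate adapters.
hypernet = TaskToLoRAHypernet(task_emb_dim=768, hidden_dim=512,
                              target_in=4096, target_out=4096)
task_emb = torch.randn(1, 768)      # stand-in for an encoded task description
lora_a, lora_b = hypernet(task_emb)
print(lora_a.shape, lora_b.shape)   # (1, 8, 4096) and (1, 4096, 8)
```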
Alternatives and similar repositories for text-to-lora
Users interested in text-to-lora are comparing it to the libraries listed below
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆322 · Updated 10 months ago
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ☆453 · Updated last month
- Build your own visual reasoning model ☆408 · Updated 2 weeks ago
- Code release for "LLMs can see and hear without any training" ☆450 · Updated 4 months ago
- Training teachers with reinforcement learning to make LLMs learn how to reason for test-time scaling. ☆339 · Updated 2 months ago
- Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004) ☆601 · Updated this week
- Official repository for "DynaSaur: Large Language Agents Beyond Predefined Actions" ☆349 · Updated 8 months ago
- A lightweight, local-first, and free experiment tracking library from Hugging Face 🤗 ☆850 · Updated this week
- Self-Adapting Language Models ☆781 · Updated last month
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆428 · Updated 2 weeks ago
- ☆155 · Updated 4 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆391 · Updated last week
- Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification. ☆661 · Updated 5 months ago
- Inference, fine-tuning, and many more recipes with the Gemma family of models ☆267 · Updated last month
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,141 · Updated 7 months ago
- GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T… ☆240 · Updated 2 weeks ago
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆370 · Updated 2 weeks ago
- ☆226 · Updated 6 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆343 · Updated 9 months ago
- Dream 7B, a large diffusion language model ☆959 · Updated 3 weeks ago
- Pretraining and inference code for a large-scale depth-recurrent language model ☆826 · Updated last week
- On the Theoretical Limitations of Embedding-Based Retrieval ☆490 · Updated last week
- ☆596 · Updated 2 weeks ago
- DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation ☆726 · Updated 2 months ago
- Live-bending a foundation model’s output at the neural network level. ☆265 · Updated 5 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆123 · Updated 3 weeks ago
- GRadient-INformed MoE ☆264 · Updated 11 months ago
- ☆175 · Updated last month
- CodeScientist: An automated scientific discovery system for code-based experiments ☆289 · Updated 2 months ago
- Official repository for "NoLiMa: Long-Context Evaluation Beyond Literal Matching" ☆148 · Updated last month