Aloriosa / srmt
The original Shared Recurrent Memory Transformer implementation
☆23Updated 2 months ago
Alternatives and similar repositories for srmt:
Users who are interested in srmt are comparing it to the libraries listed below.
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models☆54Updated last month
- A testbed for agents and environments that can automatically improve models through data generation.☆23Updated last month
- Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (arXiv:2401.01335)☆29Updated last year
- ☆17Updated last month
- ☆24Updated 7 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆41Updated 10 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆93Updated 6 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆84Updated last month
- ☆62Updated 3 weeks ago
- ☆13Updated 4 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples☆84Updated 3 weeks ago
- Agentic Knowledgeable Self-awareness☆47Updated this week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆55Updated 7 months ago
- Vintix: Action Model via In-Context Reinforcement Learning☆34Updated last month
- Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models"☆23Updated 3 weeks ago
- This code accompanies the paper "Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration."☆27Updated 5 months ago
- A repository for research on medium-sized language models.☆76Updated 10 months ago
- NeurIPS 2024 tutorial on LLM Inference☆41Updated 4 months ago
- ☆48Updated 5 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)☆28Updated last month
- ☆35Updated last month
- Data preparation code for CrystalCoder 7B LLM☆44Updated 11 months ago
- Learn online intrinsic rewards from LLM feedback☆35Updated 4 months ago
- Official Code Release for "Training a Generally Curious Agent"☆20Updated 2 weeks ago
- ☆106Updated 2 months ago
- We develop world models that can be adapted with natural language. Integrating these models into artificial agents allows humans to effe…☆22Updated last year
- ☆16Updated last month
- Repo to reproduce the First-Explore paper results☆37Updated 3 months ago
- ☆55Updated 8 months ago
- ☆81Updated last year