Aloriosa / srmt
The original Shared Recurrent Memory Transformer implementation
☆26 Updated last week
Alternatives and similar repositories for srmt
Users who are interested in srmt compare it to the libraries listed below
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆15 Updated 3 weeks ago
- ☆40 Updated last week
- ☆11 Updated 10 months ago
- ☆65 Updated 2 months ago
- ☆19 Updated this week
- ☆50 Updated this week
- How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training ☆35 Updated last month
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆27 Updated last week
- ☆17 Updated 3 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆90 Updated 2 months ago
- ☆9 Updated last month
- Official Code Release for "Training a Generally Curious Agent" ☆21 Updated 2 weeks ago
- ☆29 Updated 3 weeks ago
- ☆24 Updated 8 months ago
- ☆13 Updated 5 months ago
- ☆68 Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models ☆21 Updated 2 months ago
- [ACL 2025] Agentic Knowledgeable Self-awareness ☆68 Updated last month
- ☆49 Updated 6 months ago
- MARFT stands for Multi-Agent Reinforcement Fine-Tuning. This repository implements an LLM-based multi-agent reinforcement fine-tuning fra… ☆35 Updated 2 weeks ago
- ☆19 Updated last week
- How to create rational LLM-based agents? Using game-theoretic workflows! ☆67 Updated 3 months ago
- A testbed for agents and environments that can automatically improve models through data generation. ☆24 Updated 3 months ago
- A repository for research on medium sized language models. ☆76 Updated last year
- ☆59 Updated 10 months ago
- This repository contains popular code generation frameworks such as MapCoder, CodeSIM. ☆49 Updated last month
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆24 Updated 2 months ago
- Official repo of paper LM2 ☆40 Updated 3 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆88 Updated last week