facebookresearch / RAM
A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM).
☆230 · Updated last week
Alternatives and similar repositories for RAM
Users who are interested in RAM are comparing it to the libraries listed below.
- SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning ☆261 · Updated this week
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. ☆355 · Updated this week
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆193 · Updated last week
- Official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆173 · Updated this week
- ☆92 · Updated 7 months ago
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ☆187 · Updated 9 months ago
- Async pipelined version of Verl ☆78 · Updated last month
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆179 · Updated 2 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆327 · Updated 5 months ago
- A project to improve skills of large language models ☆383 · Updated this week
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training" ☆132 · Updated last month
- Repository for the paper Stream of Search: Learning to Search in Language ☆146 · Updated 3 months ago
- This is the official repository for Inheritune. ☆111 · Updated 3 months ago
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" ☆146 · Updated 3 weeks ago
- "Improving Mathematical Reasoning with Process Supervision" by OpenAI ☆108 · Updated last week
- Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024 ☆128 · Updated 2 months ago
- ☆291 · Updated 2 months ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆176 · Updated last month
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆158 · Updated 10 months ago
- The HELMET Benchmark ☆146 · Updated last month
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆307 · Updated 5 months ago
- Repo of paper "Free Process Rewards without Process Labels" ☆147 · Updated 2 months ago
- A brief and partial summary of RLHF algorithms. ☆128 · Updated 2 months ago
- Code for Paper: Learning Adaptive Parallel Reasoning with Language Models ☆81 · Updated 3 weeks ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆105 · Updated this week
- PyTorch building blocks for the OLMo ecosystem ☆212 · Updated this week
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆156 · Updated this week
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024) ☆204 · Updated 11 months ago
- ☆110 · Updated 3 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆134 · Updated 7 months ago