heaplax / ARMAP
☆21 Updated 2 months ago
Alternatives and similar repositories for ARMAP
Users who are interested in ARMAP are comparing it to the repositories listed below:
- ☆38 Updated last week
- Natural Language Reinforcement Learning ☆87 Updated 4 months ago
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ☆56 Updated 4 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆64 Updated 11 months ago
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision ☆62 Updated 10 months ago
- ☆40 Updated 6 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆94 Updated last month
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning ☆34 Updated last week
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆48 Updated 6 months ago
- ☆128 Updated 10 months ago
- ☆45 Updated 3 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆32 Updated last year
- ☆58 Updated 2 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆81 Updated last month
- Official implementation of the paper "Process Reward Model with Q-value Rankings" ☆57 Updated 3 months ago
- Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity"