sunblaze-ucb / reasoning_ladder
☆24Updated last week
Alternatives and similar repositories for reasoning_ladder:
Users who are interested in reasoning_ladder are comparing it to the repositories listed below
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆86Updated last month
- ☆24Updated 7 months ago
- Agentic Knowledgeable Self-awareness☆50Updated last week
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆93Updated 6 months ago
- Accompanying material for the sleep-time compute paper☆17Updated last week
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples☆84Updated last month
- Official Code Release for "Training a Generally Curious Agent"☆20Updated 3 weeks ago
- ☆46Updated 2 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆55Updated 7 months ago
- ☆24Updated last month
- ☆20Updated 4 months ago
- ☆50Updated 5 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆29Updated last month
- ☆48Updated 5 months ago
- ☆55Updated 2 weeks ago
- The code implementation of Symbolic-MoE☆27Updated last month
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv 2401.01335)☆29Updated last year
- ☆114Updated 2 months ago
- Official Repository of Are Your LLMs Capable of Stable Reasoning?☆25Updated last month
- ☆107Updated 3 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆63Updated last month
- Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models"☆24Updated this week
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning☆52Updated 3 weeks ago
- ☆15Updated 2 weeks ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts?☆50Updated 2 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code"☆55Updated 2 weeks ago
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"☆53Updated 6 months ago
- Implementation of "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models"☆27Updated 2 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners☆79Updated 3 weeks ago
- ☆36Updated last month