samkhur006 / awesome-llm-planning-reasoning
A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning materials.
☆187Updated 2 months ago
Related projects
Alternatives and complementary repositories for awesome-llm-planning-reasoning
- ☆72Updated 5 months ago
- ☆137Updated 6 months ago
- Code for the paper 🌳 Tree Search for Language Model Agents☆138Updated 3 months ago
- Can Language Models Solve Olympiad Programming?☆100Updated 3 months ago
- The official evaluation suite and dynamic data release for MixEval.☆224Updated last week
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'☆161Updated last month
- An Analytical Evaluation Board of Multi-turn LLM Agents☆250Updated 6 months ago
- AWM: Agent Workflow Memory☆205Updated last month
- A benchmark list for evaluation of large language models.☆68Updated 4 months ago
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents", ACL'24 Best Resource Pap…☆110Updated 3 weeks ago
- A benchmark that challenges language models to code solutions for scientific problems☆87Updated this week
- A compilation of the best multi-agent papers☆258Updated 2 weeks ago
- This is a collection of resources for computer-use agents, including videos, blogs, papers, and projects.☆102Updated last week
- RewardBench: the first evaluation tool for reward models.☆431Updated 3 weeks ago
- A platform for developers to simulate a research community☆88Updated this week
- ☆247Updated 5 months ago
- FireAct: Toward Language Agent Fine-tuning☆255Updated last year
- ☆116Updated 5 months ago
- ☆90Updated 4 months ago
- ☆226Updated this week
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL☆204Updated this week
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆191Updated last month
- Official Implementation of Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization☆111Updated 6 months ago
- Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038)☆170Updated last month
- This repository contains an LLM benchmark for the social deduction game 'Resistance Avalon'☆84Updated 2 months ago
- Code and example data for the paper: Rule Based Rewards for Language Model Safety☆158Updated 4 months ago
- KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents☆172Updated last month
- Repository for the paper Stream of Search: Learning to Search in Language☆91Updated 3 months ago
- Reformatted Alignment☆112Updated last month
- A simple unified framework for evaluating LLMs☆145Updated last week