hsaest/Agent-Planning-Analysis

[NAACL'25] "Revealing the Barriers of Language Agents in Planning"

☆13

Alternatives and similar repositories for Agent-Planning-Analysis

Users interested in Agent-Planning-Analysis are comparing it to the repositories listed below.

Di-viner / LLM-Robustness-to-Irrelevant-Information
[COLM'24] How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?
☆23 · Oct 13, 2024 · Updated last year
language-agent-tutorial / language-agent-tutorial.github.io
[EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks
☆10 · Nov 27, 2024 · Updated last year
TEAM-ARM / arm
[NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model
☆68 · Apr 6, 2026 · Updated 2 months ago
rhyang2021 / ARIA
Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".
☆30 · Aug 9, 2025 · Updated 10 months ago
RenzeLou / Muffin
MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following
☆16 · Oct 31, 2024 · Updated last year
RenzeLou / AAAR-1.0
The source code for running LLMs on the AAAR-1.0 benchmark.
☆20 · Apr 5, 2025 · Updated last year
jiangjiechen / auction-arena
Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A…
☆49 · Jan 28, 2024 · Updated 2 years ago
TIGER-AI-Lab / GenAI-Arena
Interface for GenAI-Arena [NeurIPS24]
☆17 · Feb 27, 2024 · Updated 2 years ago
OSU-NLP-Group / Deductive-Beam-Search
[COLM'24] "Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning"
☆21 · Jun 14, 2024 · Updated 2 years ago
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54 · Feb 23, 2024 · Updated 2 years ago
facebookresearch / dualformer
implementation of dualformer
☆25 · Mar 1, 2025 · Updated last year
PathMMU-Benchmark / PathMMU
☆38 · Dec 11, 2024 · Updated last year
kaleidoscope257-git / SHUbeamer
SHUbeamer is a LaTeX Beamer template written to help Shanghai University faculty and students prepare presentations
☆10 · Dec 1, 2021 · Updated 4 years ago
OSU-NLP-Group / LLM-Knowledge-Conflict
[ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"
☆83 · Apr 12, 2024 · Updated 2 years ago
Luo-Jiaming / GIFT_CL
[CVPR'2025] Synthetic Data is an Elegant GIFT for Continual Vision-Language Models
☆25 · Jun 29, 2025 · Updated last year
YujieLu10 / Seeker
☆11 · May 24, 2024 · Updated 2 years ago
allenai / sso
Repository for Skill Set Optimization
☆14 · Jul 26, 2024 · Updated last year
LuckyJQ / dxq-cloud
The 丢小墙 mini program project, built with Tencent Cloud's cloud development platform
☆11 · Dec 10, 2022 · Updated 3 years ago
cognitiveailab / GPT-simulator
☆32 · Jun 12, 2024 · Updated 2 years ago
AluminiumOxide / ShuffleNet_Attention_Extend
add attention mechanism in InvertedResidual block about shuffleNetV2
☆10 · Mar 2, 2024 · Updated 2 years ago
mathpn / llm-docsmith
Generate Python docstrings automatically with LLM and syntax trees
☆20 · Jun 13, 2025 · Updated last year
MaxBelitsky / cache-steering
KV Cache Steering for Inducing Reasoning in Small Language Models
☆49 · Jul 24, 2025 · Updated 11 months ago
kohjingyu / search-agents
Code for the paper 🌳 Tree Search for Language Model Agents
☆223 · Jul 25, 2024 · Updated last year
HKUST-KnowComp / ComHyper
[EMNLP2020] When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models
☆11 · Nov 10, 2020 · Updated 5 years ago
google-deepmind / natural-plan
☆57 · Sep 16, 2024 · Updated last year
Shiguang-Guo / Open-Grounded-Planning
☆11 · Jun 11, 2024 · Updated 2 years ago
dki-lab / few-shot-bioIE
True Few-Shot BioIE: Benchmarking GPT-3 In-Context and Small PLM Fine-Tuning
☆12 · Jul 6, 2022 · Updated 3 years ago
mu-cai / TemporalBench
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
☆40 · Nov 10, 2024 · Updated last year
OSU-NLP-Group / AttrScore
Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"
☆56 · Jul 3, 2023 · Updated 3 years ago
karlsbeard / fduer-manual
a website from fduers and for fduers
☆18 · Mar 9, 2025 · Updated last year
jonathan-roberts1 / SciFIBench
NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation
☆13 · May 24, 2025 · Updated last year
drogozhang / pytorch-TripletSemiHardLoss
pytorch-TripletSemiHardLoss
☆10 · Jan 12, 2022 · Updated 4 years ago
OSU-NLP-Group / Online-Mind2Web
An Illusion of Progress? Assessing the Current State of Web Agents
☆187 · Jun 25, 2026 · Updated last week
BinbinQi / Math-Model
Math Model Code
☆15 · Apr 25, 2021 · Updated 5 years ago
TobiasLee / VEC
Visual and Embodied Concepts evaluation benchmark
☆21 · Oct 10, 2023 · Updated 2 years ago
AIM3-RUC / MPMQA
Official repository of the paper MPMQA: Multimodal Question Answering on Product Manuals (AAAI 2023)
☆21 · Nov 28, 2022 · Updated 3 years ago
slhmy / SHUpdtp
Program design&training platform server for Shanghai University.
☆14 · Jun 25, 2022 · Updated 4 years ago
momo-journey / mbart-chinese
Chinese generation tasks with MBart, the multilingual denoising pre-trained model
☆11 · May 27, 2021 · Updated 5 years ago
Abbey4799 / CuteGPT
An open-source conversational language model developed by the Knowledge Works Research Laboratory at Fudan University.
☆64 · Oct 12, 2023 · Updated 2 years ago