[NAACL'25] "Revealing the Barriers of Language Agents in Planning"
☆13Jun 22, 2025Updated 8 months ago
Alternatives and similar repositories for Agent-Planning-Analysis
Users who are interested in Agent-Planning-Analysis are comparing it to the repositories listed below
- [COLM'24] "How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?"☆22Oct 13, 2024Updated last year
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks☆10Nov 27, 2024Updated last year
- Interface for GenAI-Arena [NeurIPS24]☆17Feb 27, 2024Updated 2 years ago
- Agent-based implementation of RAG, incorporating AI agents into the RAG pipeline to orchestrate its components and perform additional act…☆19Feb 20, 2025Updated last year
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model☆65Oct 26, 2025Updated 4 months ago
- MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following☆16Oct 31, 2024Updated last year
- The source code for running LLMs on the AAAR-1.0 benchmark.☆18Apr 5, 2025Updated 10 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆26Aug 9, 2025Updated 6 months ago
- [NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding☆22Oct 10, 2024Updated last year
- implementation of dualformer☆24Mar 1, 2025Updated last year
- [AAAI 2026] Multimodal Deepresearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework☆45Jan 25, 2026Updated last month
- [COLM'24] "Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning"☆21Jun 14, 2024Updated last year
- Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A…☆49Jan 28, 2024Updated 2 years ago
- ☆26Jul 8, 2025Updated 7 months ago
- Evaluating the faithfulness of long-context language models☆30Oct 21, 2024Updated last year
- ☆27Jan 22, 2025Updated last year
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs☆29May 22, 2025Updated 9 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Feb 23, 2024Updated 2 years ago
- ☆31Jun 12, 2024Updated last year
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- A Text2SQL benchmark for evaluation of Large Language Models☆41Feb 24, 2026Updated last week
- [NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner.☆35Updated this week
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)☆32Apr 9, 2025Updated 10 months ago
- KV Cache Steering for Inducing Reasoning in Small Language Models☆46Jul 24, 2025Updated 7 months ago
- ☆33Dec 11, 2024Updated last year
- TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models☆37Nov 10, 2024Updated last year
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios☆16Oct 18, 2024Updated last year
- ☆18Jun 10, 2025Updated 8 months ago
- Study and research with your docs, media, and AI in one place☆33Updated this week
- SHUbeamer is a LaTeX Beamer template written to help Shanghai University faculty and students prepare presentations☆10Dec 1, 2021Updated 4 years ago
- Code for the paper 🌳 Tree Search for Language Model Agents☆220Jul 25, 2024Updated last year
- Multi-Granularity LLM Debugger [ICSE2026]☆96Jul 6, 2025Updated 7 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'☆234Jul 19, 2025Updated 7 months ago
- ☆13Nov 21, 2025Updated 3 months ago
- Column generation implementation based on Google OR-Tools for the cutting stock problem☆14Aug 19, 2025Updated 6 months ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents"☆29Oct 23, 2025Updated 4 months ago
- ☆13Nov 5, 2024Updated last year
- This project is a Token Sale dApp that allows one to buy tokens and also displays recently minted tokens on the Solana blockchain using t…☆11Jul 30, 2024Updated last year
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi…☆30Oct 30, 2025Updated 4 months ago