Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)
☆37Dec 29, 2024Updated last year
Alternatives and similar repositories for Middleware
Users that are interested in Middleware are comparing it to the libraries listed below
Sorting:
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Feb 23, 2024Updated 2 years ago
- [ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents☆27Feb 17, 2026Updated last month
- Web-grounded natural language instructions☆18Nov 25, 2024Updated last year
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"☆80Apr 12, 2024Updated last year
- Paper collections of the continuous effort start from World Models.☆206Jul 6, 2024Updated last year
- Code for Research Project TLDR☆25Jul 28, 2025Updated 7 months ago
- [TMLR'25] "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents"☆96Oct 5, 2025Updated 5 months ago
- 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023)☆24Oct 10, 2023Updated 2 years ago
- Paper collections of methods that using language to interact with environment, including interact with real world, simulated world or WWW…☆129Jul 26, 2023Updated 2 years ago
- SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis.☆117Apr 14, 2025Updated 11 months ago
- [ICML'24] TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks☆32Sep 20, 2024Updated last year
- An Illusion of Progress? Assessing the Current State of Web Agents☆156Jan 2, 2026Updated 2 months ago
- A lightweight script for processing HTML page to markdown format with support for code blocks☆82Apr 14, 2024Updated last year
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult…☆834Feb 3, 2025Updated last year
- ☆54Aug 25, 2023Updated 2 years ago
- ☆23Aug 14, 2023Updated 2 years ago
- Code for Paper: Harnessing Webpage Uis For Text Rich Visual Understanding☆53Dec 12, 2024Updated last year
- [COLM'24] "Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning"☆21Jun 14, 2024Updated last year
- [EMNLP'23] Code for Generating Data for Symbolic Language with Large Language Models☆18Oct 21, 2023Updated 2 years ago
- This is the Repository for Geometry Problem Solving Method Evaluation☆26Oct 8, 2024Updated last year
- "Few-shot In-context Learning for Knowledge Base Question Answering" [ACL2023]☆66Jan 27, 2025Updated last year
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Dec 19, 2024Updated last year
- Code for CVPR22 paper One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones☆13Jul 27, 2022Updated 3 years ago
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving"☆19May 25, 2023Updated 2 years ago
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies☆60Feb 6, 2026Updated last month
- Source code of paper 'LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval' (WWW 2023)☆22Aug 28, 2023Updated 2 years ago
- ☆15Dec 10, 2021Updated 4 years ago
- A Self-Training Framework for Vision-Language Reasoning☆88Jan 23, 2025Updated last year
- ☆18Jan 3, 2025Updated last year
- ☆24Apr 3, 2025Updated 11 months ago
- Paper collection on building and evaluating language model agents via executable language grounding☆366Apr 29, 2024Updated last year
- UniParse: A universal graph-based parsing toolkit☆10Oct 2, 2019Updated 6 years ago
- [ACL 2024] On the Multi-turn Instruction Following for Conversational Web Agents☆17Oct 12, 2024Updated last year
- Implementation and experiments for Partially Supervised NER via Expected Entity Ratio in TACL 2022☆14Nov 7, 2022Updated 3 years ago
- A MBTI test on Large Language Model like GPT-3.☆27May 2, 2022Updated 3 years ago
- Code for ACL2021 long paper: Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases☆29Dec 8, 2021Updated 4 years ago
- [ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist☆35Oct 23, 2024Updated last year
- A repo for LLM jailbreak☆14Sep 5, 2023Updated 2 years ago
- ☆34Mar 5, 2026Updated 2 weeks ago