Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)
☆37Dec 29, 2024Updated last year
Alternatives and similar repositories for Middleware
Users that are interested in Middleware are comparing it to the libraries listed below
Sorting:
- [CIKM'24] Reviving the Context: Camera Trap Species Classification as Link Prediction on Multimodal Knowledge Graphs☆12Apr 2, 2025Updated 10 months ago
- Web-grounded natural language instructions☆18Nov 25, 2024Updated last year
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"☆80Apr 12, 2024Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Feb 23, 2024Updated 2 years ago
- [TMLR'25] "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents"☆94Oct 5, 2025Updated 4 months ago
- The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning".☆66Apr 18, 2023Updated 2 years ago
- Paper collections of the continuous effort start from World Models.☆204Jul 6, 2024Updated last year
- Code for Research Project TLDR☆25Jul 28, 2025Updated 7 months ago
- A lightweight script for processing HTML page to markdown format with support for code blocks☆82Apr 14, 2024Updated last year
- [ICML'24] TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks☆32Sep 20, 2024Updated last year
- Visual RAG using less than 300 lines of code.☆30Mar 2, 2024Updated last year
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Dec 19, 2024Updated last year
- Paper collections of methods that using language to interact with environment, including interact with real world, simulated world or WWW…☆129Jul 26, 2023Updated 2 years ago
- ☆54Aug 25, 2023Updated 2 years ago
- PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion☆59Feb 29, 2024Updated 2 years ago
- [ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents☆26Feb 17, 2026Updated last week
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult…☆826Feb 3, 2025Updated last year
- [EMNLP'23] Code for Generating Data for Symbolic Language with Large Language Models☆18Oct 21, 2023Updated 2 years ago
- FitBot is an advanced health-centric chatbot powered by LLMs. It seamlessly integrates with the Nutrition endpoint of API Ninjas to provi…☆21Aug 5, 2023Updated 2 years ago
- GenAI Examples☆16Dec 13, 2024Updated last year
- 💙 Unstructured Data Connectors for Haystack 2.0☆17Sep 21, 2023Updated 2 years ago
- Example for Logging LLM Evaluator Prompt Responses☆18Aug 14, 2023Updated 2 years ago
- ☆45Aug 28, 2023Updated 2 years ago
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving"☆19May 25, 2023Updated 2 years ago
- Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering☆50Aug 2, 2022Updated 3 years ago
- This is the Repository for Geometry Problem Solving Method Evaluation☆26Oct 8, 2024Updated last year
- AI Multi-agent system for real-time, adaptive supply chain coordination and optimization leveraging responsive AI clusters.☆36Mar 28, 2024Updated last year
- Source code of paper 'LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval' (WWW 2023)☆22Aug 28, 2023Updated 2 years ago
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon!☆53Apr 7, 2025Updated 10 months ago
- Code for Paper: Harnessing Webpage Uis For Text Rich Visual Understanding☆53Dec 12, 2024Updated last year
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"☆56Jul 3, 2023Updated 2 years ago
- ☆20Apr 24, 2024Updated last year
- ☆26Mar 28, 2024Updated last year
- Automatic Integration for Neural Spatio-Temporal Point Process models (AI-STPP) is a new paradigm for exact, efficient, non-parametric inf…☆25Oct 14, 2024Updated last year
- ☆23Aug 14, 2023Updated 2 years ago
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w…☆947Nov 5, 2025Updated 3 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"☆64Oct 19, 2024Updated last year
- Code for ACL2021 long paper: Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases☆29Dec 8, 2021Updated 4 years ago
- ☆30Jun 25, 2024Updated last year