thunlp / MatPlotAgent
☆63 Updated 9 months ago
Alternatives and similar repositories for MatPlotAgent:
Users who are interested in MatPlotAgent are comparing it to the repositories listed below.
- [ICLR 2024] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use ☆85 Updated last year
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation. ☆125 Updated 9 months ago
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets" ☆52 Updated 10 months ago
- This is the code repo for our paper "Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents". ☆103 Updated 5 months ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆57 Updated 5 months ago
- InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks (ICML 2024) ☆113 Updated 3 months ago
- [ACL 2024] On the Multi-turn Instruction Following for Conversational Web Agents ☆16 Updated 5 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated last year
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆67 Updated last month
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆94 Updated last month
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆65 Updated 4 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆111 Updated 6 months ago
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs (ACL 2024) ☆60 Updated 5 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆132 Updated 5 months ago
- [NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? ☆120 Updated 7 months ago
- ☆44 Updated 3 months ago
- [ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning ☆218 Updated 2 months ago
- Syntax Error-Free and Generalizable Tool Use for LLMs via Finite-State Decoding ☆27 Updated last year
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆109 Updated 8 months ago
- Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs. ☆42 Updated 9 months ago
- Code for the 2024 arXiv publication "Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Mo… ☆23 Updated 9 months ago
- This is the implementation of LeCo ☆32 Updated 2 months ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios ☆53 Updated last year
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales ☆81 Updated last month
- [ICLR 2025] This is the code repo for our ICLR’25 paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rew… ☆32 Updated last month
- Code for the paper "Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding" ☆63 Updated 9 months ago
- Official repository for paper "TableBench: A Comprehensive and Complex Benchmark for Table Question Answering" ☆42 Updated this week
- NaturalCodeBench (Findings of ACL 2024) ☆62 Updated 5 months ago
- EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining. ☆132 Updated last year
- A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench. ☆139 Updated last week