[ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
☆70 · Aug 5, 2025 · Updated 7 months ago
Alternatives and similar repositories for UltraTool
Users who are interested in UltraTool are comparing it to the repositories listed below.
- [ICASSP2024] Code for paper "SDIF-DA: A Shallow-to-Deep Interaction Framework with Data Augmentation for Multi-modal Intent Detection" · ☆15 · Jul 6, 2024 · Updated last year
- [ICLR 2024] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use · ☆109 · Mar 21, 2024 · Updated last year
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios · ☆73 · May 13, 2025 · Updated 9 months ago
- [COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models · ☆18 · Jan 18, 2025 · Updated last year
- This repository contains the ToolSelect dataset which was used to fine-tune Llama-2 70B for tool selection. · ☆22 · Mar 11, 2024 · Updated last year
- This is the repository for the Tool Learning survey. · ☆481 · Aug 9, 2025 · Updated 6 months ago
- A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench. · ☆217 · Apr 15, 2025 · Updated 10 months ago
- ☆31 · May 8, 2025 · Updated 9 months ago
- A framework for evolving and testing question-answering datasets with various models. · ☆22 · Feb 28, 2024 · Updated 2 years ago
- m&ms: A Benchmark to Evaluate Tool-Use for multi-step multi-modal tasks · ☆45 · Sep 26, 2024 · Updated last year
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step · ☆304 · Apr 3, 2024 · Updated last year
- Codes and data for KDD 2024 Research Track paper "ProCom: A Few-shot Targeted Community Detection Algorithm" · ☆11 · Aug 15, 2024 · Updated last year
- ☆28 · Nov 10, 2025 · Updated 3 months ago
- Official code for AAAI2023 paper "Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum" · ☆45 · Feb 9, 2025 · Updated last year
- ☆123 · Jun 6, 2024 · Updated last year
- ☆16 · Jun 14, 2023 · Updated 2 years ago
- ☆15 · Aug 18, 2022 · Updated 3 years ago
- MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning · ☆138 · Oct 10, 2025 · Updated 4 months ago
- The baseline method for CCIR 22 https://www.datafountain.cn/competitions/573 · ☆13 · Aug 2, 2022 · Updated 3 years ago
- Molecular Explanation Generator · ☆17 · Jan 26, 2022 · Updated 4 years ago
- ☆15 · Nov 5, 2024 · Updated last year
- ☆18 · May 14, 2024 · Updated last year
- [ACL 2024] On the Multi-turn Instruction Following for Conversational Web Agents · ☆17 · Oct 12, 2024 · Updated last year
- ☆14 · Aug 3, 2021 · Updated 4 years ago
- Official code and dataset for our EMNLP 2024 Findings paper: Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Kn… · ☆19 · Dec 27, 2024 · Updated last year
- https://jiaweisii.github.io/gorgeous/ · ☆18 · Feb 24, 2026 · Updated last week
- ☆917 · Jul 24, 2024 · Updated last year
- ☆17 · Oct 17, 2022 · Updated 3 years ago
- The official repo for DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph · ☆18 · Oct 13, 2024 · Updated last year
- ☆188 · Jan 27, 2025 · Updated last year
- ☆21 · May 24, 2024 · Updated last year
- Implementation of "Multi-modal Retrieval Augmented Multi-modal Generation: Datasets, Evaluation Metrics and Strong Baselines" · ☆30 · Feb 24, 2025 · Updated last year
- The code of Paper "Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering". · ☆22 · Sep 1, 2022 · Updated 3 years ago
- [ICCV'25] "Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection". · ☆25 · Jan 12, 2026 · Updated last month
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral] · ☆396 · May 20, 2024 · Updated last year
- The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmar… · ☆53 · Nov 5, 2024 · Updated last year
- [NeurIPS 2024] Official implementation for paper "Can Graph Learning Improve Planning in LLM-based Agents?" · ☆151 · May 11, 2025 · Updated 9 months ago
- ☆20 · Apr 24, 2024 · Updated last year
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators · ☆26 · Jul 26, 2023 · Updated 2 years ago