The official code for "ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases"
☆886Oct 26, 2024Updated last year
Alternatives and similar repositories for ToolAlpaca
Users that are interested in ToolAlpaca are comparing it to the libraries listed below.
- [ICLR'24] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use☆112Mar 21, 2024Updated 2 years ago
- [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.☆5,603May 21, 2025Updated 10 months ago
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios☆73May 13, 2025Updated 11 months ago
- ☆919Jul 24, 2024Updated last year
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step☆306Apr 3, 2024Updated 2 years ago
- ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels …☆285Aug 19, 2023Updated 2 years ago
- A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.☆227Apr 15, 2025Updated last year
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios☆71Aug 5, 2025Updated 8 months ago
- ToolBench, an evaluation suite for LLM tool manipulation capabilities.☆173Feb 28, 2024Updated 2 years ago
- AgentTuning: Enabling Generalized Agent Abilities for LLMs☆1,483Oct 31, 2023Updated 2 years ago
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)☆3,334Feb 8, 2026Updated 2 months ago
- Paper collection on building and evaluating language model agents via executable language grounding☆365Apr 29, 2024Updated last year
- [EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning☆15May 13, 2025Updated 11 months ago
- We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tunin…☆2,798Dec 12, 2023Updated 2 years ago
- DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.☆1,544Jan 22, 2026Updated 2 months ago
- Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha…☆135Jun 4, 2024Updated last year
- Tool Learning for Big Models, Open-Source Solutions of ChatGPT-Plugins☆2,779Dec 5, 2023Updated 2 years ago
- ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings - NeurIPS 2023 (oral)☆271Apr 18, 2024Updated 2 years ago
- Official code for AAAI2023 paper `Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum`☆46Feb 9, 2025Updated last year
- ☆164Apr 17, 2023Updated 3 years ago
- GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the u…☆770Dec 19, 2023Updated 2 years ago
- NexusRaven-13B, a new SOTA Open-Source LLM for function calling. This repo contains everything for reproducing our evaluation on NexusRav…☆319Sep 29, 2023Updated 2 years ago
- This is the repository for the Tool Learning survey.☆481Aug 9, 2025Updated 8 months ago
- Collection of papers for scalable automated alignment.☆93Oct 22, 2024Updated last year
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral]☆408May 20, 2024Updated last year
- [NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents☆520Sep 6, 2024Updated last year
- An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.☆1,966Aug 9, 2025Updated 8 months ago
- ChatGLM3 series: Open Bilingual Chat LLMs☆13,724Jan 13, 2025Updated last year
- JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning☆10Nov 3, 2024Updated last year
- A large-scale 7B pretraining language model developed by BaiChuan-Inc.☆5,671Jul 18, 2024Updated last year
- Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, …☆6,649Oct 24, 2024Updated last year
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)☆12,838Apr 13, 2026Updated last week
- Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining and instruction-tuning datasets☆3,051Apr 14, 2024Updated 2 years ago
- ☆244Aug 14, 2024Updated last year
- Aligning pretrained language models with instruction data generated by themselves.☆4,586Mar 27, 2023Updated 3 years ago
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)☆70,203Apr 12, 2026Updated last week
- ☆11Jun 11, 2024Updated last year
- OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, …☆6,866Updated this week
- The implementation for CIKM 2024: Towards Completeness-Oriented Tool Retrieval for Large Language Models.☆25Nov 6, 2024Updated last year