OpenGVLab / Awesome-LLM4Tool
A curated list of papers, repositories, tutorials, and anything else related to large language models for tools
☆64 Updated last year
Related projects
Alternatives and complementary repositories for Awesome-LLM4Tool
- This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI" ☆85 Updated last month
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ☆38 Updated 3 months ago
- Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆22 Updated last month
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆46 Updated 3 weeks ago
- Touchstone: Evaluating Vision-Language Models by Language Models ☆76 Updated 9 months ago
- ☆45 Updated last year
- ☆56 Updated 9 months ago
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models ☆66 Updated 4 months ago
- ☆72 Updated 8 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆37 Updated 6 months ago
- Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs ☆41 Updated 4 months ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ☆41 Updated 4 months ago
- MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities. ☆68 Updated 3 weeks ago
- ☆65 Updated last year
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models" ☆43 Updated last week
- An Easy-to-use Hallucination Detection Framework for LLMs. ☆49 Updated 6 months ago
- ☆24 Updated 9 months ago
- The Official Code Repository for GUI-World. ☆36 Updated 3 months ago
- Towards Large Multimodal Models as Visual Foundation Agents ☆113 Updated last week
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆100 Updated 7 months ago
- [ACL 2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios ☆43 Updated 7 months ago
- A curated list of resources about long-context in large-language models and video understanding. ☆30 Updated last year
- GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes fr… ☆64 Updated 4 months ago
- PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion ☆45 Updated 8 months ago
- Recent advancements propelled by large language models (LLMs), encompassing an array of domains including Vision, Audio, Agent, Robotics,… ☆110 Updated 2 weeks ago
- ☆29 Updated this week
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆56 Updated 8 months ago
- ☆37 Updated 5 months ago
- Vision Large Language Models trained on M3IT instruction tuning dataset ☆17 Updated last year
- ☆84 Updated 10 months ago