jun0wanan / awesome-large-multimodal-agents
☆294 · Updated 3 months ago
Related projects:
- The model, data and code for the visual GUI Agent SeeClick ☆182 · Updated 3 weeks ago
- Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Mod… ☆247 · Updated last month
- MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU ☆321 · Updated 9 months ago
- Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs. ☆510 · Updated this week
- Efficient Multimodal Large Language Models: A Survey ☆230 · Updated last month
- Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag… ☆441 · Updated 4 months ago
- This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for E… ☆323 · Updated last week
- Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, T2I-Adapter, IP-Adapter. ☆355 · Updated this week
- Papers related to LLM agents published at top conferences ☆302 · Updated 7 months ago
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents ☆297 · Updated 5 months ago
- 💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents. ☆122 · Updated this week
- (CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions. ☆303 · Updated 2 months ago
- ✨✨ Latest Papers and Benchmarks in Reasoning with Foundation Models ☆422 · Updated 2 months ago
- 📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM). ☆363 · Updated this week
- [ACL 2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step ☆209 · Updated 5 months ago
- Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?" ☆142 · Updated 2 weeks ago
- This is the repository for the Tool Learning survey. ☆193 · Updated 2 weeks ago
- A Framework of Small-scale Large Multimodal Models ☆568 · Updated last week
- [NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models" ☆500 · Updated 7 months ago
- paper: https://arxiv.org/abs/2307.02469 page: https://lynx-llm.github.io/ ☆226 · Updated last year
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" ☆224 · Updated 2 months ago
- [CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allo… ☆255 · Updated 3 weeks ago
- [CVPR 2024] OneLLM: One Framework to Align All Modalities with Language ☆553 · Updated this week
- This repository contains the paper list for the paper: Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reaso… ☆329 · Updated 9 months ago
- Open Platform for Embodied Agents ☆241 · Updated last week
- RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness ☆200 · Updated last week
- [ACL 2024] A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future ☆275 · Updated 2 months ago
- A collection of recent papers on building autonomous agents. Two topics included: RL-based / LLM-based agents. ☆518 · Updated last month
- ☆112 · Updated 4 months ago
- [CVPR 2024] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback ☆218 · Updated last week