MAT: Multi-modal Agent Tuning 🔥 ICLR 2025 (Spotlight)
☆90Dec 18, 2025Updated 3 months ago
Alternatives and similar repositories for MAT-Agent
Users that are interested in MAT-Agent are comparing it to the libraries listed below.
- ☆13Nov 5, 2024Updated last year
- 【ICLR 2025 🔥】MMKE-Bench, a challenging benchmark for evaluating diverse semantic editing in real-world scenarios.☆21Apr 19, 2025Updated 11 months ago
- ☆63Dec 5, 2025Updated 3 months ago
- ☆30Jun 19, 2024Updated last year
- [AAAI 2026]Release of code, datasets and model for our work TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for General…☆77Dec 1, 2025Updated 3 months ago
- A powerful automation agent for macOS that enables natural language control of various system applications and services. This agent allow…☆57Jun 5, 2025Updated 9 months ago
- ☆56Oct 3, 2024Updated last year
- ☆15Jun 6, 2024Updated last year
- Code for ACM MM 2024 paper "A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning"☆20Dec 5, 2024Updated last year
- ☆13Sep 14, 2022Updated 3 years ago
- ☆22May 23, 2025Updated 10 months ago
- Analyzing LLM Alignment via Token distribution shift☆17Jan 26, 2024Updated 2 years ago
- Lab tasks for the course on "Data Engineering for Machine Learning"☆10May 1, 2023Updated 2 years ago
- A derivative of the open-source software Anki that simplifies some operations; a "foolproof" English-learning application☆15Dec 8, 2022Updated 3 years ago
- ☆17Nov 1, 2024Updated last year
- ☆68Sep 15, 2025Updated 6 months ago
- MemoryEQA☆24Nov 18, 2025Updated 4 months ago
- General-purpose Visual Understanding Evaluation☆20Dec 21, 2023Updated 2 years ago
- code of the CVPR 2020 paper "Learning to Optimize on SPD Manifolds"☆13Sep 12, 2020Updated 5 years ago
- This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)☆304Dec 5, 2024Updated last year
- ELIXIR: Learning from User Feedback on Explanations to Improve Recommender Models☆10Feb 15, 2021Updated 5 years ago
- ☆10Nov 21, 2023Updated 2 years ago
- R1-Vision: Let's first take a look at the image☆48Feb 16, 2025Updated last year
- ☆17Apr 19, 2021Updated 4 years ago
- Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…☆1,373Mar 9, 2026Updated 2 weeks ago
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling."☆154Jul 22, 2025Updated 8 months ago
- Official implementation of our paper "Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Opera…☆11Sep 20, 2024Updated last year
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- ☆48Oct 28, 2025Updated 4 months ago
- A customizable lightweight Grad-CAM implementation☆16Nov 30, 2019Updated 6 years ago
- ☆56Nov 30, 2025Updated 3 months ago
- ☆121Jul 22, 2025Updated 8 months ago
- Code for "APTBench: Benchmarking Agentic Potential of Base LLMs During Pre-Training"☆39Dec 23, 2025Updated 3 months ago
- Code of CVPR 2023 paper Meta-causal Learning for Single Domain Generalization☆20Oct 7, 2023Updated 2 years ago
- ReproZip for the Preservation of Web Applications☆17May 6, 2024Updated last year
- The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"☆87Oct 15, 2025Updated 5 months ago
- Framework of DataLog Neural Program Synthesis☆26Apr 2, 2019Updated 6 years ago
- ☆14Jun 21, 2019Updated 6 years ago
- Code for COLING 2020 paper "Controllable Abstractive Sentence Summarization with Guiding Entities"☆12Dec 24, 2020Updated 5 years ago