Reading list for Multimodal Large Language Models
☆70Aug 17, 2023Updated 2 years ago
Alternatives and similar repositories for Awesome-Multimodal-LLM
Users that are interested in Awesome-Multimodal-LLM are comparing it to the libraries listed below.
- Research Trends in LLM-guided Multimodal Learning.☆356Oct 17, 2023Updated 2 years ago
- Code and Data for the ACL21 paper "Modeling Bilingual Conversational Characteristics for Neural Chat Translation"☆12Dec 17, 2021Updated 4 years ago
- [ECCV 2024 Workshop🎈] The first agriculture benchmark to evaluate MM-LLMs.☆26Jan 1, 2025Updated last year
- Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Mod…☆375Mar 19, 2025Updated last year
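- [WIP@Oct 13] 质衡 benchmark (Q-Bench in Chinese), including Chinese versions of the low-level visual question answering and low-level visual description datasets, as well as image quality assessment under Chinese prompts. We will release Q-Bench in more languages in the futu…☆24Jan 7, 2024Updated 2 years ago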
- ☆16May 30, 2025Updated last year
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆27Jun 22, 2026Updated last week
- ☆102Dec 22, 2023Updated 2 years ago
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.☆14Mar 20, 2024Updated 2 years ago
- gradio bbox labeling tools☆11May 12, 2023Updated 3 years ago
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria☆76Oct 16, 2024Updated last year
- [ACMMM2025] Official released code for ALLM4ADD☆42Oct 30, 2025Updated 7 months ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents"☆31Oct 23, 2025Updated 8 months ago
- S2R-HDR: A Large-Scale Rendered Dataset for HDR Fusion☆20Feb 11, 2026Updated 4 months ago
- ☆18Feb 12, 2025Updated last year
- ☆55Apr 1, 2024Updated 2 years ago
- [ Arxiv 2023 ] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection"☆16Aug 30, 2023Updated 2 years ago
- Unofficial PyTorch Implementation for paper FlashFace☆15Apr 9, 2024Updated 2 years ago
- List of references, algorithms, and applications in SSL in RS (contributions are welcome)☆18May 1, 2023Updated 3 years ago
- This is the official repository for the code and datasets in the paper "Progressive Open Space Expansion for Open-Set Model Attribution",…☆25Oct 22, 2023Updated 2 years ago
- ☆121Oct 8, 2023Updated 2 years ago
- Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration☆1,591Jan 1, 2025Updated last year
- ICCV 2021☆34May 11, 2022Updated 4 years ago
- paddle code convert toolkit☆22Mar 19, 2023Updated 3 years ago
- CLI to convert Scrapbox page to Markdown☆12May 24, 2026Updated last month
- 💬 A curated list of a large number of publications related to Dialogue Systems, especially Chatbots and Chit-chat Systems☆10Dec 5, 2019Updated 6 years ago
- Gradio demo used in our Osprey: Pixel Understanding with Visual Instruction Tuning.☆16Dec 19, 2023Updated 2 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification☆11Aug 12, 2023Updated 2 years ago
- PyTorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)☆56Feb 6, 2023Updated 3 years ago
- ☆20Nov 27, 2025Updated 7 months ago
- ☆14Sep 20, 2021Updated 4 years ago
- Implementation based on lightrag and nano-graphrag to connect with psql☆15Oct 28, 2024Updated last year
- Awesome List of Vision Language Prompt Papers☆48Nov 9, 2023Updated 2 years ago
- Corpus analyses confrontation☆21Jan 24, 2023Updated 3 years ago
- [EMNLP'23 Oral] ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue PyTorch Implementation☆12Dec 4, 2023Updated 2 years ago
- The source code of paper "CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking"☆79Dec 16, 2022Updated 3 years ago
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts☆36Jul 2, 2024Updated last year
- A web application for visualizing the results of social science survey data.☆12May 22, 2020Updated 6 years ago
- [ACL 2025 Main] SceneGenAgent: Precise Industrial Scene Generation with Coding Agent☆37Nov 29, 2024Updated last year