KuaiMod / KuaiMod.github.io

☆65

Alternatives and similar repositories for KuaiMod.github.io

Users who are interested in KuaiMod.github.io are comparing it to the repositories listed below.

alipay / Ant-Multi-Modal-Framework
Research Code for Multimodal-Cognition Team in Ant Group
☆172 Updated 3 months ago
zhourax / VEGA
☆37 Updated last year
QQ-MM / QQMM-embed
☆23 Updated 3 months ago
VectorSpaceLab / MegaPairs
[ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval
☆241 Updated 3 months ago
JUNJIE99 / VISTA_Evaluation_FineTuning
Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original c…
☆46 Updated last year
friedrichor / UNITE
Official code for "Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval"
☆42 Updated 7 months ago
RLHF-V / RLHF-V
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
☆306 Updated last year
deepglint / UniME
[ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"
☆103 Updated last month
scenarios / WeMM
☆88 Updated last year
FreedomIntelligence / ALLaVA
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
☆281 Updated last year
yuyq96 / TextHawk
Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
☆65 Updated last year
Code-kunkun / LamRA
[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
☆174 Updated 7 months ago
PanguIR / MRAGSurvey
A Survey of Multimodal Retrieval-Augmented Generation
☆20 Updated 3 months ago
open-compass / MMBench
Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"
☆285 Updated 8 months ago
Victorwz / Open-Qwen2VL
[COLM 2025] Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources
☆305 Updated 5 months ago
ksOAn6g5 / TaiSu
TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)
☆190 Updated 2 years ago
yuyq96 / R1-Vision
R1-Vision: Let's first take a look at the image
☆48 Updated 11 months ago
BAAI-DCAI / DataOptim
A collection of visual instruction tuning datasets.
☆76 Updated last year
TIGER-AI-Lab / VLM2Vec
This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]
☆565 Updated last week
JiuTian-VL / JiuTian-LION
[CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
☆153 Updated 5 months ago
X-PLUG / Youku-mPLUG
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
☆302 Updated 2 years ago
Liuziyu77 / MMDU
Official repository of MMDU dataset
☆103 Updated last year
DataArcTech / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆91 Updated last year
OpenGVLab / OmniCorpus
[ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
☆410 Updated 9 months ago
mynameischaos / Lion
Lion: Kindling Vision Intelligence within Large Language Models
☆51 Updated 2 years ago
infly-ai / INF-MLLM
☆114 Updated 3 weeks ago
OpenGVLab / MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…
☆122 Updated last year
benywon / ChiQA
The implementations of various baselines in our CIKM 2022 paper: ChiQA: A Large Scale Image-based Real-World Question Answering Dataset f…
☆33 Updated last year
TIGER-AI-Lab / UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
☆177 Updated last year
yuxie11 / R2D2
☆168 Updated 2 years ago