starsuzi / VideoRAG

VideoRAG: Retrieval-Augmented Generation over Video Corpus

☆24

Alternatives and similar repositories for VideoRAG

Users who are interested in VideoRAG are comparing it to the repositories listed below.

LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆15 Updated 10 months ago
kyegomez / MC-ViT
Implementation of the MC-ViT model from the paper "Memory Consolidation Enables Long-Context Video Understanding"
☆21 Updated last month
alinlab / HOMER
Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024).
☆39 Updated 9 months ago
IDEA-FinAI / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆75 Updated 6 months ago
kyegomez / Mirasol
PyTorch implementation of the model from "Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities"
☆26 Updated 3 months ago
ByungKwanLee / Phantom
[Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla…
☆58 Updated 7 months ago
RenShuhuai-Andy / TESTA
[EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
☆50 Updated last year
SivanDoveh / DAC
Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models"
☆27 Updated last year
Vision-CAIR / dochaystacks
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents, CVPR 2025
☆18 Updated 3 months ago
kkahatapitiya / LangRepo
Language Repository for Long Video Understanding
☆31 Updated 11 months ago
luomancs / ReMuQ
A multimodal retrieval dataset
☆22 Updated last year
haon-chen / mmE5
☆48 Updated 2 months ago
levymsn / ChatIR
Official repository of "Chatting Makes Perfect: Chat-based Image Retrieval"
☆31 Updated 3 months ago
Hxyou / IdealGPT
Official Code of IdealGPT
☆35 Updated last year
yuhui-zh15 / AutoConverter
Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202…
☆25 Updated 2 months ago
hackerchenzhuo / LaKo
[Paper][IJCKG 2022] LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection
☆26 Updated last year
om-ai-lab / ZoomEye
ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration
☆34 Updated 4 months ago
junchen14 / Awesome_ChatGPT_papers
This repository collects and shares ChatGPT-related papers and useful tools
☆18 Updated 2 years ago
TIGER-AI-Lab / VisualWebInstruct
The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search"
☆24 Updated last week
chenzhongwu20 / RuleRAG_ICL_FT
RuleRAG: Rule-guided Retrieval-Augmented Generation with Language Models for Question Answering
☆22 Updated 6 months ago
yuezih / Movie101
Narrative movie understanding benchmark
☆70 Updated last year
lixinustc / GraphAdapter
The efficient tuning method for VLMs
☆81 Updated last year
uds-lsv / MCSE
NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings
☆55 Updated 11 months ago
GraphPKU / CoI
Chain of Images for Intuitively Reasoning
☆9 Updated last year
TIGER-AI-Lab / ABC
ABC: Achieving Better Control of Multimodal Embeddings using VLMs
☆11 Updated last month
showlab / MovieSeq
[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences
☆39 Updated 2 months ago
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆74 Updated 6 months ago
edchengg / oven_eval
ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities
☆40 Updated 8 months ago
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆42 Updated 7 months ago
showlab / Awesome-Long-Context
A curated list of resources about long context in large language models and video understanding.
☆31 Updated last year