LinWeizheDragon/FLMR

LinWeizheDragon / FLMR

The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.

☆108

Alternatives and similar repositories for FLMR

Users that are interested in FLMR are comparing it to the libraries listed below.

LinWeizheDragon / Retrieval-Augmented-Visual-Question-Answering
This is the official repository for Retrieval Augmented Visual Question Answering
☆251 · Dec 19, 2024 · Updated last year
edchengg / infoseek_eval
EMNLP 2023 - InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions
☆26 · May 30, 2024 · Updated 2 years ago
open-vision-language / infoseek
☆78 · Oct 27, 2023 · Updated 2 years ago
Go2Heart / EchoSight
[EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.
☆87 · Jan 19, 2026 · Updated 5 months ago
OpenMatch / UniVL-DR
[ICLR 2023] This is the code repo for our ICLR'23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa…
☆52 · Jul 3, 2024 · Updated 2 years ago
TIGER-AI-Lab / UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
☆183 · Oct 1, 2024 · Updated last year
NJUDeepEngine / CAEF
Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"
☆11 · Oct 11, 2024 · Updated last year
Code-kunkun / LamRA
[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
☆182 · Jul 7, 2025 · Updated last year
LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16 · Jul 6, 2024 · Updated 2 years ago
DataArcTech / RagVL
Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …
☆92 · Nov 15, 2024 · Updated last year
open-vision-language / oven
☆47 · Aug 15, 2023 · Updated 2 years ago
mjeensung / xtr-pytorch
☆19 · May 16, 2024 · Updated 2 years ago
matchyc / mysteryann
🏆 Winning NeurIPS (NIPS) Competition Track: Big ANN, Practical Vector Search Challenge 2023. (see big-ann-benchmark https://big-ann-benc…
☆30 · Aug 16, 2024 · Updated last year
Yushi-Hu / PromptCap
Natural language guided image captioning
☆89 · Feb 11, 2024 · Updated 2 years ago
LuminosityX / FNE
Implementation of our paper, Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination.
☆20 · Dec 3, 2023 · Updated 2 years ago
Omaralsaabi / M3DOCRAG
An implementation of "M3DOCRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding" by Jaemin Cho, Debanj…
☆56 · Nov 13, 2024 · Updated last year
zhangy0822 / USER
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024
☆33 · Jun 18, 2025 · Updated last year
FreedomIntelligence / TRIM
We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆22 · Jan 11, 2026 · Updated 5 months ago
tdlhl / RAD
[NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis"
☆27 · Nov 21, 2025 · Updated 7 months ago
kite99520 / DialSummEval
Resources for paper "DialSummEval: Revisiting summarization evaluation for dialogues"
☆14 · Jul 22, 2025 · Updated 11 months ago
edchengg / oven_eval
ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities
☆44 · Jun 7, 2025 · Updated last year
liyongqi67 / GRACE
☆29 · Aug 25, 2024 · Updated last year
amazon-science / robust-tableqa
Two approaches for robust TableQA: 1) ITR is a general-purpose retrieval-based approach for handling long tables in TableQA transformer m…
☆41 · Aug 21, 2023 · Updated 2 years ago
a-antoniades / swe-search
☆12 · Nov 5, 2024 · Updated last year
Aofei-Chang / MedHEval
Repo for preprint 2025 "MedHEval: Benchmarking Hallucinations and Mitigation Strategies in Medical Large Vision-Language Models"
☆16 · Apr 23, 2025 · Updated last year
khoadoan106 / single_loss_quantization
☆24 · Jul 12, 2022 · Updated 3 years ago
zengyan-97 / X2-VLM
All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)
☆169 · Aug 22, 2024 · Updated last year
haoyu-bu / CAFe
Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"
☆33 · Mar 26, 2025 · Updated last year
aimagelab / ReT
[CVPR 2025] Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval
☆37 · Sep 12, 2025 · Updated 9 months ago
tsb0601 / MMVP
☆363 · Jan 27, 2024 · Updated 2 years ago
TengFeiHan0 / CenterMask_plus
☆14 · Mar 26, 2020 · Updated 6 years ago
guilk / KAT
Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language"
☆71 · Jul 11, 2022 · Updated 3 years ago
ilkerkesen / frozen
A PyTorch implementation of Multimodal Few-Shot Learning with Frozen Language Models with OPT.
☆44 · Jul 23, 2022 · Updated 3 years ago
PaulLerner / ViQuAE
Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev…
☆39 · Dec 19, 2024 · Updated last year
HITsz-TMG / Cognitive-Visual-Language-Mapper
The code and datasets for our ACL 2024 Main Conference paper titled "Cognitive Visual-Language Mapper: Advancing Multimodal Comprehens…
☆17 · Jan 24, 2025 · Updated last year
OpenGVLab / MM-Interleaved
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer
☆254 · Apr 3, 2024 · Updated 2 years ago
ZHDXZZQ / Interview-Study
Study notes
☆11 · Oct 30, 2024 · Updated last year
google-deepmind / xtr
XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval
☆64 · Jun 20, 2024 · Updated 2 years ago
rotem-shalev / ImageRAG
☆105 · Mar 20, 2026 · Updated 3 months ago