The Hugging Face implementation of the Fine-grained Late-interaction Multi-modal Retriever (FLMR).
☆104 · May 30, 2025 · Updated 9 months ago
Alternatives and similar repositories for FLMR
Users who are interested in FLMR are comparing it to the libraries listed below.
- This is the official repository for Retrieval Augmented Visual Question Answering ☆244 · Dec 19, 2024 · Updated last year
- ☆68 · Oct 27, 2023 · Updated 2 years ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 · Oct 11, 2024 · Updated last year
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge. ☆79 · Jan 19, 2026 · Updated last month
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ☆178 · Oct 1, 2024 · Updated last year
- ☆19 · May 16, 2024 · Updated last year
- [ICLR 2023] This is the code repo for our ICLR'23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa… ☆53 · Jul 3, 2024 · Updated last year
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant ☆178 · Jul 7, 2025 · Updated 8 months ago
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced … ☆91 · Nov 15, 2024 · Updated last year
- ☆25 · Aug 23, 2024 · Updated last year
- ☆43 · Aug 15, 2023 · Updated 2 years ago
- ICCV 2023 (Oral) Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities ☆43 · Jun 7, 2025 · Updated 8 months ago
- ☆12 · Nov 5, 2024 · Updated last year
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs ☆54 · Mar 9, 2025 · Updated 11 months ago
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023 ☆16 · Nov 28, 2024 · Updated last year
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025) ☆30 · Dec 12, 2024 · Updated last year
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024 ☆33 · Jun 18, 2025 · Updated 8 months ago
- ☆20 · Nov 20, 2024 · Updated last year
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer. ☆14 · Mar 2, 2024 · Updated 2 years ago
- ☆360 · Jan 27, 2024 · Updated 2 years ago
- 🏆 Winning NeurIPS (NIPS) Competition Track: Big ANN, Practical Vector Search Challenge 2023. (see big-ann-benchmark https://big-ann-benc… ☆30 · Aug 16, 2024 · Updated last year
- ☆92 · Nov 25, 2023 · Updated 2 years ago
- Code of fine-tuning neural sparse models and training from scratch. #SIGIR2025 ☆24 · Feb 6, 2026 · Updated last month
- [CVPR2023] This is an official implementation of paper "DETRs with Hybrid Matching". ☆14 · Sep 1, 2022 · Updated 3 years ago
- Code for paper: Unified Text-to-Image Generation and Retrieval ☆16 · Jul 6, 2024 · Updated last year
- Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev… ☆38 · Dec 19, 2024 · Updated last year
- Implementation code for 'PLATE: A Prompt-Enhanced Paradigm for Multi-Scenario Recommendations' in SIGIR 2023 ☆13 · Sep 27, 2024 · Updated last year
- [KDD'2024] "HiGPT: Heterogenous Graph Language Models" ☆146 · Jun 5, 2024 · Updated last year
- ☆17 · Feb 19, 2024 · Updated 2 years ago
- ☆44 · Jul 23, 2024 · Updated last year
- We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their… ☆20 · Jan 11, 2026 · Updated last month
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆23 · Feb 9, 2025 · Updated last year
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference. ☆46 · Jun 11, 2025 · Updated 8 months ago
- Code and data of "Controllable Unsupervised Event-based Video Generation" (accepted as ICIP oral and invited by WACV workshop) ☆19 · Nov 5, 2024 · Updated last year
- Implementation of our paper, Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination. ☆20 · Dec 3, 2023 · Updated 2 years ago
- [CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering ☆54 · Jul 14, 2025 · Updated 7 months ago
- ☆95 · Mar 3, 2025 · Updated last year
- [ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining" ☆194 · Mar 17, 2025 · Updated 11 months ago
- ☆51 · May 11, 2025 · Updated 9 months ago