The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.
☆106May 30, 2025Updated 10 months ago
Alternatives and similar repositories for FLMR
Users that are interested in FLMR are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- This is the official repository for Retrieval Augmented Visual Question Answering☆248Dec 19, 2024Updated last year
- EMNLP2023 - InfoSeek: A New VQA Benchmark focus on Visual Info-Seeking Questions☆25May 30, 2024Updated last year
- ☆69Oct 27, 2023Updated 2 years ago
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.☆82Jan 19, 2026Updated 2 months ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)☆179Oct 1, 2024Updated last year
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"☆11Oct 11, 2024Updated last year
- [CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant☆180Jul 7, 2025Updated 9 months ago
- Code for paper: Unified Text-to-Image Generation and Retrieval☆16Jul 6, 2024Updated last year
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆91Nov 15, 2024Updated last year
- ☆44Aug 15, 2023Updated 2 years ago
- ☆19May 16, 2024Updated last year
- natual language guided image captioning☆87Feb 11, 2024Updated 2 years ago
- Implementation of our paper, Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination..☆20Dec 3, 2023Updated 2 years ago
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024☆33Jun 18, 2025Updated 9 months ago
- Serverless GPU API endpoints on Runpod - Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- We introduce new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…☆22Jan 11, 2026Updated 3 months ago
- [CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering☆55Jul 14, 2025Updated 9 months ago
- ☆52May 11, 2025Updated 11 months ago
- Two approaches for robust TableQA: 1) ITR is a general-purpose retrieval-based approach for handling long tables in TableQA transformer m…☆41Aug 21, 2023Updated 2 years ago
- ☆12Nov 5, 2024Updated last year
- ☆25Aug 23, 2024Updated last year
- ☆24Jul 12, 2022Updated 3 years ago
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"☆33Mar 26, 2025Updated last year
- [CVPR 2025] Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval☆36Sep 12, 2025Updated 7 months ago
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- ☆360Jan 27, 2024Updated 2 years ago
- ☆14Mar 26, 2020Updated 6 years ago
- (ACL 2025) MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale☆49Jun 4, 2025Updated 10 months ago
- 🌏 Modular retrievers for zero-shot multilingual IR.☆30Mar 6, 2024Updated 2 years ago
- Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language"☆70Jul 11, 2022Updated 3 years ago
- A PyTorch implementation of Multimodal Few-Shot Learning with Frozen Language Models with OPT.☆44Jul 23, 2022Updated 3 years ago
- Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev…☆38Dec 19, 2024Updated last year
- 【CVPR 2025】SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting☆16Jul 1, 2025Updated 9 months ago
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval☆61Jun 20, 2024Updated last year
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Code of the paper Relation-enhanced Negative Sampling for Multimodal Knowledge Graph Completion (ACM MM22))☆27May 22, 2024Updated last year
- 学习记录☆11Oct 30, 2024Updated last year
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025)☆30Dec 12, 2024Updated last year
- Implementation of "Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation"☆82Jul 31, 2023Updated 2 years ago
- [IJCAI-24] Explore Internal and External Similarity for Single Image Deraining with Graph Neural Networks☆10Sep 2, 2024Updated last year
- The official code and model for ACL 2023 paper 'mCLIP: Multilingual CLIP via Cross-lingual Transfer'☆10Jan 23, 2024Updated 2 years ago
- Code for our ACL-2023 paper: "Combo of Thinking and Observing for Outside-Knowledge VQA"☆12Jun 30, 2023Updated 2 years ago