☆43 · Aug 15, 2023 · Updated 2 years ago
Alternatives and similar repositories for oven
Users that are interested in oven are comparing it to the libraries listed below
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer. ☆14 · Mar 2, 2024 · Updated 2 years ago
- ☆68 · Oct 27, 2023 · Updated 2 years ago
- EMNLP2023 - InfoSeek: A New VQA Benchmark focus on Visual Info-Seeking Questions ☆25 · May 30, 2024 · Updated last year
- Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev… ☆38 · Dec 19, 2024 · Updated last year
- Source code of paper 'LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval' (WWW 2023) ☆22 · Aug 28, 2023 · Updated 2 years ago
- MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi),… ☆13 · Jan 16, 2024 · Updated 2 years ago
- [WWW 2025 Oral] ImageScope: Unifying Language-Guided Image Retrieval via Large Multimodal Model Collective Reasoning ☆20 · Jul 2, 2025 · Updated 8 months ago
- [ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data ☆14 · Sep 30, 2023 · Updated 2 years ago
- Official repository for the A-OKVQA dataset ☆110 · May 8, 2024 · Updated last year
- ☆38 · Feb 28, 2023 · Updated 3 years ago
- Hierarchical entity typing via multi-level learning to rank ☆12 · Oct 13, 2020 · Updated 5 years ago
- Code and model for AAAI 2024: UMIE: Unified Multimodal Information Extraction with Instruction Tuning ☆46 · Jun 5, 2024 · Updated last year
- An automatic MLLM hallucination detection framework ☆19 · Sep 26, 2023 · Updated 2 years ago
- [ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. ☆19 · Jun 7, 2024 · Updated last year
- ☆22 · Jan 14, 2026 · Updated last month
- [ECCV'22 Poster] Explicit Image Caption Editing ☆22 · Nov 30, 2022 · Updated 3 years ago
- Must-read papers on Fine-grained Entity Typing ☆19 · Jul 7, 2022 · Updated 3 years ago
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024] ☆24 · Aug 13, 2024 · Updated last year
- Demo for advanced Java final project in 18-19 1 of Canghong Jin ☆25 · Nov 18, 2018 · Updated 7 years ago
- Source code of paper 'Open Hierarchical Relation Extraction' (NAACL 2021) ☆22 · Mar 4, 2022 · Updated 4 years ago
- Dataset and code for EMNLP 2022 "Visual Named Entity Linking: A New Dataset and A Baseline" ☆27 · Apr 16, 2023 · Updated 2 years ago
- The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever. ☆104 · May 30, 2025 · Updated 9 months ago
- ☆63 · Jan 3, 2025 · Updated last year
- [CVPR 2025] Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval ☆34 · Sep 12, 2025 · Updated 5 months ago
- SimMatchV2: Semi-Supervised Learning with Graph Consistency ☆22 · Dec 26, 2023 · Updated 2 years ago
- Distributed Optimization Infra for learning CLIP models ☆27 · Oct 3, 2024 · Updated last year
- [ICML'24 Oral] "MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions" ☆207 · Oct 28, 2024 · Updated last year
- Resource and Code for ICME 2021 paper "MNRE: A Challenge Multimodal Dataset for Neural Relation Extraction with Visual Evidence in Social… ☆70 · Nov 23, 2021 · Updated 4 years ago
- Dataset and starting code for visual entailment dataset ☆119 · Apr 21, 2022 · Updated 3 years ago
- Official implementation of our LREC-COLING 2024 paper "Generative Multimodal Entity Linking". ☆36 · Feb 27, 2025 · Updated last year
- This is the code repo for our paper "Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts". ☆45 · Sep 27, 2025 · Updated 5 months ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073 ☆31 · Jul 9, 2024 · Updated last year
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge. ☆79 · Jan 19, 2026 · Updated last month
- Multimodal entity linking for Tweets ☆29 · Aug 30, 2021 · Updated 4 years ago
- Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation" ☆46 · Feb 26, 2026 · Updated last week
- Code for reproducing the ACL'23 paper: Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments ☆78 · May 17, 2025 · Updated 9 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆37 · Dec 29, 2024 · Updated last year
- RareAct: A video dataset of unusual interactions ☆33 · Aug 4, 2020 · Updated 5 years ago
- ☆33 · Nov 12, 2018 · Updated 7 years ago