AtsuMiyai / UPD
[ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models
☆80 · May 29, 2025 · Updated 8 months ago
Alternatives and similar repositories for UPD
Users who are interested in UPD are comparing it to the libraries listed below:
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆63 · Oct 19, 2024 · Updated last year
- Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey [Miyai+, TMLR2025] ☆98 · Jun 16, 2025 · Updated 7 months ago
- [ICML 2024] "Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection" ☆14 · Feb 15, 2025 · Updated last year
- Follow-Up Differential Descriptions: Language Models Resolve Ambiguities for Image Classification ☆11 · Nov 15, 2023 · Updated 2 years ago
- ☆25 · Sep 19, 2023 · Updated 2 years ago
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at Neurips 2023 (Main confer… ☆27 · Mar 29, 2024 · Updated last year
- PyTorch Implementation of "ASTRA: An Action Spotting TRAnsformer for Soccer Videos", ACM MMSports 2023. | 3rd place solution for SoccerNe… ☆41 · May 20, 2024 · Updated last year
- Multiple Transformation Function Estimation for Image Enhancement ☆22 · Oct 20, 2024 · Updated last year
- [AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues ☆60 · May 2, 2025 · Updated 9 months ago
- This repository holds the "Fully automated landmarking and facial segmentation on 3D photographs" files ☆30 · Oct 23, 2023 · Updated 2 years ago
- (ECCV 2024) Can OOD Object Detectors Learn from Foundation Models? ☆25 · Dec 7, 2024 · Updated last year
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ☆155 · Apr 30, 2024 · Updated last year
- Official Pytorch Implementation of Self-emerging Token Labeling ☆35 · Mar 27, 2024 · Updated last year
- [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(… ☆325 · Oct 14, 2025 · Updated 4 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆117 · Mar 13, 2025 · Updated 11 months ago
- DALI Multi Agent System Framework ☆42 · Jan 30, 2026 · Updated 2 weeks ago
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025) ☆26 · Feb 25, 2025 · Updated 11 months ago
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps" ☆25 · Jul 12, 2024 · Updated last year
- [NeurIPS VLM workshop 2024] In-Context Ensemble Learning from Pseudo Labels Improves Video-Language Models for Low-Level Workflow Underst… ☆23 · Mar 16, 2025 · Updated 10 months ago
- Repository for the SPDTransNet model, a Transformer-based architecture to analyze sequences of SPD matrices without loss of their Riemann… ☆35 · Oct 15, 2024 · Updated last year
- [TMLR'24] This repository includes the official implementation our paper "FedConv: Enhancing Convolutional Neural Networks for Handling D… ☆25 · Apr 30, 2024 · Updated last year
- This repository is the project page for "Point Anywhere: Directed Object Estimation from Omnidirectional Images", including source code … ☆12 · Aug 25, 2023 · Updated 2 years ago
- ☆88 · Jan 10, 2024 · Updated 2 years ago
- [CVPR 2024] GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding ☆44 · Mar 15, 2024 · Updated last year
- This repository contains the resource introduced in the paper: "Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis"… ☆25 · Oct 15, 2025 · Updated 3 months ago
- VideoHallucer, The first comprehensive benchmark for hallucination detection in large video-language models (LVLMs) ☆42 · Dec 16, 2025 · Updated last month
- Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models ☆99 · Mar 22, 2024 · Updated last year
- NightSurveillance Dataset for Pedestrian Detection ☆11 · Jul 30, 2020 · Updated 5 years ago
- A simple python package to stretch audio files and change their speed ☆12 · Jan 16, 2026 · Updated 3 weeks ago
- 🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation". ☆47 · Mar 18, 2025 · Updated 10 months ago
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners. ☆19 · Sep 24, 2025 · Updated 4 months ago
- Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains (CVPR 2024) ☆10 · Jan 17, 2026 · Updated 3 weeks ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆48 · Feb 27, 2025 · Updated 11 months ago
- A curated list of papers & resources linked to concept learning ☆12 · Aug 9, 2023 · Updated 2 years ago
- Benchmarking and Analyzing Generative Data for Visual Recognition ☆26 · Jul 25, 2023 · Updated 2 years ago
- [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators" ☆105 · Nov 9, 2023 · Updated 2 years ago
- Code for Greedy Gradient Ensemble for Visual Question Answering (ICCV 2021, Oral) ☆27 · Mar 28, 2022 · Updated 3 years ago
- [CVPR 2024] Official implementation of the paper "DePT: Decoupled Prompt Tuning" ☆109 · Nov 24, 2025 · Updated 2 months ago
- ☆43 · May 6, 2024 · Updated last year