The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs".
☆11Sep 27, 2024Updated last year
Alternatives and similar repositories for Hallu-PI
Users that are interested in Hallu-PI are comparing it to the libraries listed below.
- An iterative, feedback-driven benchmark of LLMs' instruction-following ability☆57May 25, 2026Updated 2 weeks ago
- ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation. AAAI, 2025☆15Aug 25, 2025Updated 9 months ago
- e-SNLI-VE: Corrected Visual-Textual Entailment with Natural Language Explanations☆14Aug 19, 2021Updated 4 years ago
- This repo implements VGG's Comparator Network [1].☆11Sep 4, 2018Updated 7 years ago
- This is the official code repository for the paper: Towards General Continuous Memory for Vision-Language Models.☆28Jul 3, 2025Updated 11 months ago
- AIRS-Bench: an AI Research Science benchmark for quantifying the end-to-end AI research abilities of LLM agents☆95May 5, 2026Updated last month
- implementation of "Salient Object Ranking with Position-Preserved Attention"☆27Nov 10, 2021Updated 4 years ago
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs☆34Dec 24, 2025Updated 5 months ago
- The official implementation of our NAACL 2024 paper "A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Lang…☆160Sep 2, 2025Updated 9 months ago
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types☆25Nov 29, 2024Updated last year
- extension for fabric to handle prompts through pexpect☆44May 31, 2015Updated 11 years ago
- A Python/Cython package for graph edit distances and graph matching☆13Jan 30, 2023Updated 3 years ago
- [ICLR 2026] Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents☆93Apr 23, 2026Updated last month
- Material parsers and other tools and scripts, initially developed for Grobid Superconductor☆13Feb 21, 2025Updated last year
- ☆10Jul 2, 2020Updated 5 years ago
- Resources for our IJCAI 2020 paper, TopicKA: Generating Commonsense Knowledge-Aware Dialogue Responses Towards the Recommended Topic Fact☆12Nov 30, 2020Updated 5 years ago
- Genkō yōshi (Japanese manuscript paper); manuscript paper; Japanese-style letter paper; upTeX/upLaTeX vertical writing☆10Nov 27, 2019Updated 6 years ago
- Objective metrics for measuring visual texture similarity using STSIM features. Supervised by Thrasos Pappas.☆14Oct 4, 2023Updated 2 years ago
- Reproducible Language Agent Research☆35Jun 25, 2025Updated 11 months ago
- Unofficial implementation algorithms of attention models on SNLI dataset☆33Jun 24, 2018Updated 7 years ago
- ☆19Jul 7, 2025Updated 11 months ago
- Author implementation of Exploring Adversarial Fake Images on Face Manifold (CVPR 2021 oral)☆32Mar 2, 2023Updated 3 years ago
- Implementation of GAT with batching☆10Nov 28, 2020Updated 5 years ago
- The repo of the Doc2SoarGraph framework☆10Sep 17, 2024Updated last year
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models (EMNLP Findings 2023)☆27Dec 8, 2023Updated 2 years ago
- Code of CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping☆17Oct 8, 2022Updated 3 years ago
- Sentiment Lexicon Construction☆10Sep 17, 2019Updated 6 years ago
- The implementation of the ACL 2024 paper "MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization"☆43Jun 15, 2024Updated last year
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆33Aug 5, 2025Updated 10 months ago
- Implementation of the network model from "Multiway Attention Networks for Modeling Sentence Pairs"; can be used for question answering and sentence-level logical inference☆11Apr 13, 2020Updated 6 years ago
- Transfer Learning in Dialogue Benchmarking Toolkit☆14Mar 31, 2023Updated 3 years ago
- ☆10Feb 22, 2023Updated 3 years ago
- ☆11Feb 9, 2026Updated 4 months ago
- Multi Task Learning for Semantic Segmentation, Instance Segmentation and Depth Estimation☆12Jun 12, 2022Updated 3 years ago
- Code for TKDE paper "Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction"☆10Feb 19, 2024Updated 2 years ago
- Unofficial PyTorch implementation of "Composing Good Shots by Exploiting Mutual Relations"☆14May 13, 2022Updated 4 years ago
- Code for "Low Shot Box Correction for Weakly Supervised Object Detection"☆12Nov 22, 2022Updated 3 years ago
- [Algorithm] Computing image similarity from image color☆11Sep 16, 2020Updated 5 years ago
- DataSciBench: An LLM Agent Benchmark for Data Science (Findings of ACL 2026)☆59Jan 21, 2026Updated 4 months ago