PKU-YuanGroup / PiCO

[ICLR'25] PiCO: Peer Review in LLMs based on the Consistency Optimization, https://arxiv.org/pdf/2402.01830

☆36

Alternatives and similar repositories for PiCO

Users who are interested in PiCO are comparing it to the repositories listed below.

OpenGVLab / MMIU
[ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
☆85 Updated 10 months ago
haonan3 / V1
V1: Toward Multimodal Reasoning by Designing Auxiliary Task
☆35 Updated 3 months ago
PKU-YuanGroup / GPT-as-Language-Tree
GPT as a Monte Carlo Language Tree: A Probabilistic Perspective
☆45 Updated 6 months ago
GAIR-NLP / thinking-with-generated-images
Doodling our way to AGI ✏️ 🖼️ 🧠
☆86 Updated 2 months ago
PKU-YuanGroup / Look-Back
This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".
☆34 Updated last month
Mr-Loevan / FAST
Fast-Slow Thinking for Large Vision-Language Model Reasoning
☆17 Updated 3 months ago
Liuziyu77 / MIA-DPO
Official implementation of MIA-DPO
☆63 Updated 6 months ago
PKU-YuanGroup / AsFT
Code for the paper "AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin".
☆25 Updated last month
PKU-YuanGroup / Video-Bench
A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!
☆131 Updated last year
Dongping-Chen / ISG
(ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph.
☆27 Updated this week
dvlab-research / Prompt-Highlighter
[CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs
☆150 Updated last year
YiyangZhou / CSR
[NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models
☆77 Updated last year
zwq2018 / Multi-modal-Self-instruct
The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…
☆83 Updated 6 months ago
VIStA-H / GPT-4V_Social_Media
GPT-4V(ision) as A Social Media Analysis Engine
☆37 Updated 7 months ago
joez17 / VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆48 Updated 5 months ago
yihedeng9 / STIC
Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
☆70 Updated last year
MME-Benchmarks / MME-Unify
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
☆41 Updated 4 months ago
PKU-YuanGroup / N-LoRA
【COLING 2025🔥】Code for the paper "Is Parameter Collision Hindering Continual Learning in LLMs?".
☆35 Updated 8 months ago
NUS-TRAIL / NoisyRollout
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆84 Updated 2 months ago
DAMO-NLP-SG / CMM
✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
☆46 Updated last month
foundation-multimodal-models / CAL
[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
☆57 Updated 10 months ago
patrick-tssn / VideoHallucer
VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
☆36 Updated 4 months ago
HKUST-LongGroup / CoMM
Official repository for CoMM Dataset
☆45 Updated 7 months ago
haoyu-bu / CAFe
Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"
☆21 Updated 4 months ago
UW-Madison-Lee-Lab / CoBSAT
Implementation and dataset for paper "Can MLLMs Perform Text-to-Image In-Context Learning?"
☆40 Updated 2 months ago
core-mm / core-mm
☆17 Updated last year
AoiDragon / POPE
[EMNLP'23] The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"
☆87 Updated last year
inFaaa / Multimodal-Roadmap-for-freshman
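This project offers a learning roadmap for newcomers to the multimodal field, covering the field's classic papers, projects, and courses. It aims to help learners gain a reasonably deep understanding of the field within a set period of time and become able to conduct research independently.
☆20 Updated last year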
longvideobench / LongVideoBench
[NeurIPS'24 D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.
☆104 Updated last year
yu-rp / apiprompting
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
☆98 Updated 10 months ago