CASIA-IVA-Lab/OPT_Questioner

CASIA-IVA-Lab / OPT_Questioner

Official PyTorch implementation of the paper "Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner"

☆15

Alternatives and similar repositories for OPT_Questioner

Users who are interested in OPT_Questioner are comparing it to the repositories listed below.

Rubics-Xuan / IVG
This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with H…
☆15 · May 21, 2024 · Updated 2 years ago
CASIA-IVA-Lab / ChatBridge
ChatBridge, an approach to learning a unified multimodal model to interpret, correlate, and reason about various modalities without rely…
☆55 · Sep 4, 2023 · Updated 2 years ago
CASIA-IVA-Lab / SC-Tune
Official code for CVPR 2024 paper, "SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models"
☆16 · Apr 22, 2024 · Updated 2 years ago
CASIA-IVA-Lab / MOSO
☆35 · Jun 6, 2023 · Updated 3 years ago
CASIA-IVA-Lab / COSA
[ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
☆43 · Dec 25, 2024 · Updated last year
CASIA-IVA-Lab / MRES
This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentati…
☆74 · Jun 3, 2024 · Updated 2 years ago
SuDIS-ZJU / Data-Quality-for-Vision-Language-Models
☆35 · Nov 18, 2025 · Updated 8 months ago
Rubics-Xuan / Med-DANet
Med-DANet Series (ECCV 2022 & WACV 2024)
☆13 · Jan 2, 2024 · Updated 2 years ago
CASIA-IVA-Lab / VALOR
[TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
☆311 · Dec 25, 2024 · Updated last year
SuperBruceJia / NLNet-IQA
Non-local Modeling for Image Quality Assessment
☆13 · Dec 20, 2023 · Updated 2 years ago
CASIA-IVA-Lab / PrefixGrouper
An efficient GRPO training util.
☆56 · Jun 13, 2025 · Updated last year
CASIA-IVA-Lab / VAST
[NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
☆302 · Mar 14, 2024 · Updated 2 years ago
irvingzhang0512 / open-images-downloader
☆14 · Aug 13, 2021 · Updated 4 years ago
cyzus / thoughtsculpt
THOUGHTSCULPT, a general reasoning and search method for complex tasks
☆13 · Dec 13, 2024 · Updated last year
ivattyue / Ada-K
Official code for the ICLR 2025 paper, "Ada-K Routing: Boosting the Efficiency of MoE-based LLMs"
☆12 · Mar 1, 2025 · Updated last year
yml-bit / CTA-GAN
synthesis CTA using CT base GAN model
☆14 · Aug 16, 2022 · Updated 3 years ago
Exgc / R1V-Free
R1V, trained with AI feedback, answers open-ended visual questions.
☆14 · Apr 12, 2025 · Updated last year
liuajian / CASIA-FAS-Dataset
A series of face anti-spoofing datasets, for the convenience of management and benchmarking.
☆17 · May 12, 2026 · Updated 2 months ago
gluucose / PCCGAN
Image2Points: A 3D Point-based Context Clusters GAN for High-Quality PET Image Reconstruction (ICASSP 2024)
☆14 · Jun 16, 2024 · Updated 2 years ago
ngun7 / Image-Quality-Assessment
This project aims to perform a quality check of an image, whether an image is blur or not with a blurriness score along with brightness &…
☆18 · Nov 19, 2021 · Updated 4 years ago
zjr2000 / GVL
Official implementation for paper Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
☆28 · Dec 8, 2023 · Updated 2 years ago
PeterGriffinJin / Heterformer
Heterformer: Transformer-based Deep Node Representation Learning on Heterogeneous Text-Rich Networks (KDD 2023)
☆28 · Feb 16, 2024 · Updated 2 years ago
microsoft / VisionAsAdaptations
☆15 · May 11, 2026 · Updated 2 months ago
zhenshen-mla / Aesthetic-Emotion-Dataset
IAE Dataset, produced by Chaoran Cui, Zhen Shen, Jun Yu. A large scale dataset to facilitate multi-task learning for unified image aesthet…
☆20 · Sep 23, 2021 · Updated 4 years ago
DAVEISHAN / TimeBalance
Placeholder
☆10 · Jul 17, 2023 · Updated 3 years ago
Sliver-g / Cardiac-CLIP
☆27 · Jan 22, 2026 · Updated 5 months ago
wangst0181 / SpatialViz-Bench
☆20 · Mar 2, 2026 · Updated 4 months ago
bcmi / Composite-Image-Evaluation
☆24 · Feb 19, 2026 · Updated 5 months ago
zechao-li / SVF-few-shot-segmentation
☆22 · May 16, 2023 · Updated 3 years ago
Tencent-QQMM / Video-CCAM
A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team.
☆73 · Oct 14, 2024 · Updated last year
Wenzhuo-Liu / AMPF-Net
☆24 · May 11, 2026 · Updated 2 months ago
IntMeGroup / LMM4LMM
[ICCV 2025 Highlight] LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs
☆20 · Nov 16, 2025 · Updated 8 months ago
HY-SpongeBob / HY-SpongeBob
☆26 · May 26, 2026 · Updated last month
YeolJ00 / Personalized-Aesthetics
Official PyTorch implementation of "Scaling Up Personalized Image Aesthetic Assessment via Task Vector Customization" (ECCV 2024)
☆34 · Jun 8, 2026 · Updated last month
CASIA-IVA-Lab / VRoPE
[EMNLP 2025 Main] Official implementation of VRoPE: Rotary Position Embedding for Video Large Language Models.
☆28 · Nov 18, 2025 · Updated 8 months ago
koustav123 / aesthetics_assessment_using_graphs
Code for ICPR paper
☆21 · Nov 22, 2021 · Updated 4 years ago
jcolano / llama3_single_gpu
☆13 · Jul 23, 2024 · Updated last year
andreineculai / MPC
☆25 · May 11, 2022 · Updated 4 years ago
szzexpoi / POEM
Official Implementation for CVPR 2023 paper "Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasonin…
☆10 · Jun 16, 2024 · Updated 2 years ago