liaoning97 / REVO-LION

REVO-LION: Evaluating and Refining Vision-Language Instruction Tuning Datasets

☆11

Related projects:

showlab / MovieSeq
[ECCV2024] Learning Video Context as Interleaved Multimodal Sequences
☆17 Updated 3 weeks ago
archiki / RepARe
☆19 Updated 11 months ago
TencentARC / TaCA
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
☆15 Updated last year
codezakh / LilT
[ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
☆36 Updated last year
TencentARC / FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
☆31 Updated last year
see-say-segment / sesame
🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"
☆23 Updated 3 months ago
PKU-YuanGroup / SVD
☆13 Updated this week
UniAdapter / UniAdapter
☆21 Updated last year
LaVi-Lab / Visual-Table
Stay tuned!
☆11 Updated 5 months ago
UW-Madison-Lee-Lab / CoBSAT
Implementation and dataset for paper "Can MLLMs Perform Text-to-Image In-Context Learning?"
☆22 Updated last month
zhiheLu / Ensemble_VLM
Official code for the paper "Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models" (ICML 2024)
☆19 Updated 4 months ago
MengLcool / DeepStack-VL
☆31 Updated 3 months ago
jeykigung / HiCLIP
☆29 Updated last year
JiuTian-VL / MoME
☆13 Updated 2 months ago
feizc / Vespa
Video Diffusion State Space Models
☆19 Updated 5 months ago
TencentARC / pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
☆32 Updated last year
mightyzau / InfMLLM
☆20 Updated 9 months ago
mshukor / eP-ALM
[ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.
☆27 Updated 10 months ago
jiquan123 / TIER
TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment
☆9 Updated 8 months ago
aszala / VPEval
VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)
☆42 Updated 9 months ago
Vision-CAIR / InfiniBench
☆11 Updated 2 months ago
eric-ai-lab / Discffusion
Official repo for the paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"
☆26 Updated 4 months ago
ChocoWu / SeTok
☆32 Updated 3 months ago
eslambakr / HRS_benchmark
☆55 Updated 11 months ago
showlab / datacentric.vlp
Compress conventional Vision-Language Pre-training data
☆49 Updated 11 months ago
princetonvisualai / pointingqa
Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"
☆18 Updated last year
chenshuang-zhang / imagenet_d
☆36 Updated 4 months ago
deeplearning-wisc / NSCL
Code for ICML 2023 paper "When and How Does Known Class Help Discover Unknown Ones? Provable Understandings Through Spectral Analysis"
☆13 Updated last year
wusize / F-LMM
Code Release of F-LMM: Grounding Frozen Large Multimodal Models
☆35 Updated last month