yuhui-zh15 / C3
Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)
☆34 · Oct 16, 2024 · Updated last year
Alternatives and similar repositories for C3
Users who are interested in C3 are comparing it to the repositories listed below.
- [NeurIPS 2023] Official Pytorch code for LOVM: Language-Only Vision Model Selection ☆21 · Feb 3, 2024 · Updated 2 years ago
- A Vision-Language Benchmark for Microscopy Understanding ☆30 · Mar 13, 2025 · Updated 11 months ago
- [CVPR 2025] MicroVQA eval and 🤖RefineBot code for "MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research"… ☆32 · Nov 25, 2025 · Updated 2 months ago
- Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral) ☆130 · Nov 5, 2025 · Updated 3 months ago
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023) ☆34 · Jun 8, 2023 · Updated 2 years ago
- [EMNLP 2024] IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning ☆15 · May 13, 2025 · Updated 9 months ago
- ☆59 · Aug 30, 2023 · Updated 2 years ago
- Official implementation of "Why are Visually-Grounded Language Models Bad at Image Classification?" (NeurIPS 2024) ☆96 · Oct 19, 2024 · Updated last year
- ☆35 · Feb 5, 2024 · Updated 2 years ago
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202… ☆40 · May 26, 2025 · Updated 8 months ago
- ☆41 · Sep 9, 2025 · Updated 5 months ago
- ☆18 · Oct 28, 2025 · Updated 3 months ago
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision ☆72 · Jul 10, 2024 · Updated last year
- SotA text-only image/video method (IJCAI 2023) ☆16 · Jan 9, 2024 · Updated 2 years ago
- Github repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models ☆24 · Jun 16, 2025 · Updated 7 months ago
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning ☆55 · Aug 16, 2024 · Updated last year
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning ☆138 · Mar 16, 2023 · Updated 2 years ago
- [NeurIPS2024] Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs ☆23 · Oct 15, 2024 · Updated last year
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning" ☆32 · Mar 26, 2025 · Updated 10 months ago
- Benchmarking and Analyzing Generative Data for Visual Recognition ☆26 · Jul 25, 2023 · Updated 2 years ago
- Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning ☆45 · Jul 2, 2025 · Updated 7 months ago
- This repository houses the code for the paper - "The Neglected of VLMs" ☆30 · Dec 31, 2025 · Updated last month
- Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022 ☆31 · May 29, 2023 · Updated 2 years ago
- SVHF-Net for Cross-modal binary matching ☆32 · Aug 22, 2018 · Updated 7 years ago
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation ☆31 · Sep 6, 2024 · Updated last year
- [CVPR 2024] Improving language-visual pretraining efficiency by performing cluster-based masking on images. ☆31 · May 16, 2024 · Updated last year
- ☆138 · Sep 29, 2024 · Updated last year
- [CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale ☆213 · Feb 27, 2024 · Updated last year
- Belief Revision based Caption Re-ranker with Visual Semantic Information. COLING 2022 ☆11 · Apr 13, 2025 · Updated 10 months ago
- ☆33 · Apr 11, 2025 · Updated 10 months ago
- [CVPR 2020] A generative model with latent factors that are independent and localized. ☆12 · Mar 27, 2025 · Updated 10 months ago
- [Nature Communications] O2VAE: a model for orientation-invariant representation learning (phenotyping) in cell biology data ☆38 · Mar 26, 2025 · Updated 10 months ago
- Visual Representation Learning Benchmark for Self-Supervised Models ☆35 · Apr 18, 2024 · Updated last year
- m&ms: A Benchmark to Evaluate Tool-Use for multi-step multi-modal tasks ☆44 · Sep 26, 2024 · Updated last year
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data ☆45 · Feb 18, 2025 · Updated 11 months ago
- Code for EMNLP 2021 main conference paper "Dynamic Knowledge Distillation for Pre-trained Language Models" ☆41 · Aug 9, 2022 · Updated 3 years ago
- Pytorch code for "Improving Self-Supervised Learning by Characterizing Idealized Representations" ☆41 · Nov 27, 2022 · Updated 3 years ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆86 · Mar 21, 2024 · Updated last year
- Official Pytorch code for Open World Object Detection in the Era of Foundation Models ☆92 · Jan 26, 2024 · Updated 2 years ago