PKU-ML / adainf
Official code for ICLR 2024 paper "Do Generated Data Always Help Contrastive Learning?"
☆30 Updated 11 months ago
Alternatives and similar repositories for adainf:
Users who are interested in adainf are comparing it to the libraries listed below:
- [CVPR 2024 Highlight] ImageNet-D ☆41 Updated 4 months ago
- Official Repository of Personalized Visual Instruct Tuning ☆27 Updated this week
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models ☆33 Updated last week
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives" ☆36 Updated 3 months ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… ☆35 Updated 8 months ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt… ☆38 Updated 2 months ago
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆34 Updated 8 months ago
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆69 Updated 7 months ago
- Replication in Visual Diffusion Models: A Survey and Outlook ☆28 Updated 7 months ago
- Official PyTorch Code for "Is Synthetic Data From Diffusion Models Ready for Knowledge Distillation?" (https://arxiv.org/abs/2305.12954) ☆46 Updated last year
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models ☆48 Updated 3 months ago
- OVMR: Open-Vocabulary Recognition with Multi-Modal References (CVPR24) ☆25 Updated 3 months ago
- OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? [CVPR 2025] ☆34 Updated last week
- Official implementation of LaVin-DiT ☆24 Updated last month
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing ☆18 Updated 3 months ago
- ☆38 Updated 2 months ago
- [CVPR'25] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection ☆56 Updated last week
- Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025) ☆21 Updated last week
- FreeVA: Offline MLLM as Training-Free Video Assistant ☆57 Updated 9 months ago
- Official implementation of MIA-DPO ☆52 Updated last month
- ☆28 Updated 7 months ago
- A paper list about Token Merge, Reduce, Resample, Drop for MLLMs. ☆37 Updated last month
- Adapting LLaMA Decoder to Vision Transformer ☆26 Updated 9 months ago
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models ☆72 Updated 5 months ago
- Official code for paper: Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language ☆23 Updated last week