Qinying-Liu / TagAlign
Official implementation of TagAlign
☆31Updated 5 months ago
Related projects:
- Simple PyTorch implementation of "Libra: Building Decoupled Vision System on Large Language Models" (accepted by ICML 2024)☆41Updated 3 months ago
- ☆17Updated this week
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models☆35Updated last month
- ☆12Updated 9 months ago
- ☆52Updated last year
- ☆20Updated last year
- ☆16Updated last year
- Code for the ICML 2023 paper "Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation"☆35Updated last year
- Turning to Video for Transcript Sorting☆44Updated last year
- ☆15Updated 4 months ago
- [ECCV2024] Learning Video Context as Interleaved Multimodal Sequences☆17Updated 3 weeks ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"☆62Updated 4 months ago
- ☆20Updated 9 months ago
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023)☆40Updated last year
- ☆83Updated 9 months ago
- [ECCV 2024] ControlCap: Controllable Region-level Captioning☆49Updated last month
- Towards a Unified View on Visual Parameter-Efficient Transfer Learning☆26Updated last year
- ☆27Updated 5 months ago
- ☆35Updated last year
- Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment☆44Updated 3 months ago
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning☆41Updated 4 months ago
- FreeVA: Offline MLLM as Training-Free Video Assistant☆42Updated 3 months ago
- [CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method.☆28Updated last year
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models☆36Updated last year
- ☆17Updated 5 months ago
- (ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation☆43Updated 2 months ago
- ☆31Updated 3 months ago
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization☆53Updated 10 months ago
- [AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation☆25Updated 7 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆85Updated 2 weeks ago