Yutong-Zhou-cv / Awesome-Survey-Papers
A curated list of Survey Papers on Deep Learning.
☆10 Updated last year
Alternatives and similar repositories for Awesome-Survey-Papers:
Users who are interested in Awesome-Survey-Papers are comparing it to the repositories listed below.
- Code for CVPR 2023 paper "SViTT: Temporal Learning of Sparse Video-Text Transformers" ☆18 Updated last year
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆18 Updated this week
- Masked Vision-Language Transformer in Fashion ☆33 Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension ☆24 Updated last year
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation ☆35 Updated last year
- [CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for Grounded Situation Recognition" ☆47 Updated last year
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… ☆35 Updated 7 months ago
- ☆17 Updated 9 months ago
- Official repository for the General Robust Image Task (GRIT) Benchmark ☆50 Updated last year
- ☆29 Updated last year
- Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity" ☆55 Updated last year
- A curated list of papers and resources for text-to-image evaluation. ☆26 Updated last year
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆35 Updated 5 months ago
- LAVIS - A One-stop Library for Language-Vision Intelligence ☆47 Updated 5 months ago
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆54 Updated last year
- Official Repository of Personalized Visual Instruct Tuning ☆26 Updated 2 months ago
- Code for paper: Unified Text-to-Image Generation and Retrieval ☆13 Updated 6 months ago
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆44 Updated last year
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones". ☆24 Updated 11 months ago
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆60 Updated 5 months ago
- ☆55 Updated 8 months ago
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing ☆15 Updated last month
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection". ☆32 Updated 4 months ago
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆37 Updated last year
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models ☆28 Updated 3 months ago
- [ICME 2023] FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation ☆11 Updated last year
- ☆25 Updated last year
- (ECCV 2024) Can OOD Object Detectors Learn from Foundation Models? ☆23 Updated last month