awaisrauf / Awesome-CV-Foundational-Models
☆547 · Nov 7, 2024 · Updated last year
Alternatives and similar repositories for Awesome-CV-Foundational-Models
Users interested in Awesome-CV-Foundational-Models are comparing it to the repositories listed below.
- A curated list of foundation models for vision and language tasks (☆1,140 · Jun 23, 2025 · Updated 7 months ago)
- (TPAMI 2024) A Survey on Open Vocabulary Learning (☆987 · Dec 24, 2025 · Updated last month)
- [CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha… (☆945 · Aug 5, 2025 · Updated 6 months ago)
- [ICCV'23 Main Track, WECIA'23 Oral] Official repository of paper titled "Self-regulating Prompts: Foundational Model Adaptation without F… (☆284 · Sep 28, 2023 · Updated 2 years ago)
- A collection of papers on the topic of "Computer Vision in the Wild (CVinW)" (☆1,354 · Mar 14, 2024 · Updated last year)
- Recent LLM-based CV and related works. Welcome to comment/contribute! (☆873 · Mar 8, 2025 · Updated 11 months ago)
- [CVPRW 2025] Official repository of paper titled "Towards Evaluating the Robustness of Visual State Space Models" (☆26 · Jun 8, 2025 · Updated 8 months ago)
- [MICCAI 2025] Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology (☆12 · Jun 17, 2025 · Updated 8 months ago)
- Official implementation for the paper "Prompt Pre-Training with Over Twenty-Thousand Classes for Open-Vocabulary Visual Recognition" (☆259 · May 3, 2024 · Updated last year)
- [ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of … (☆504 · Aug 9, 2024 · Updated last year)
- EVA Series: Visual Representation Fantasies from BAAI (☆2,648 · Aug 1, 2024 · Updated last year)
- Awesome list for research on CLIP (Contrastive Language-Image Pre-Training) (☆1,232 · Jun 28, 2024 · Updated last year)
- PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models (☆261 · Aug 5, 2025 · Updated 6 months ago)
- This repository is for the first comprehensive survey on Meta AI's Segment Anything Model (SAM). (☆1,211 · Updated this week)
- Project Page for "LISA: Reasoning Segmentation via Large Language Model" (☆2,585 · Feb 16, 2025 · Updated last year)
- ☆92 · Nov 25, 2023 · Updated 2 years ago
- [CVPR 2023] Official implementation of X-Decoder for generalized decoding for pixel, image, and language (☆1,342 · Oct 5, 2023 · Updated 2 years ago)
- [MICCAI 2023] Official code repository of paper titled "Frequency Domain Adversarial Training for Robust Volumetric Medical Segmentation"… (☆52 · Nov 14, 2023 · Updated 2 years ago)
- [ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity" (☆2,808 · Jul 10, 2025 · Updated 7 months ago)
- A curated list of prompt-based papers in computer vision and vision-language learning. (☆928 · Dec 18, 2023 · Updated 2 years ago)
- Official implementation and data release of the paper "Visual Prompting via Image Inpainting". (☆318 · Aug 7, 2023 · Updated 2 years ago)
- [T-PAMI 2024] Transformer-Based Visual Segmentation: A Survey (☆759 · Aug 25, 2024 · Updated last year)
- A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV) (☆1,394 · Jul 4, 2024 · Updated last year)
- Tracking and collecting papers/projects/others related to Segment Anything. (☆1,684 · Mar 13, 2025 · Updated 11 months ago)
- Code for the experiments in "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" (☆102 · Sep 11, 2024 · Updated last year)
- [ACCV 2024] ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes 🚀 (☆37 · Jan 21, 2025 · Updated last year)
- Grounded Language-Image Pre-training (☆2,573 · Jan 24, 2024 · Updated 2 years ago)
- NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024 (☆1,812 · Nov 27, 2025 · Updated 2 months ago)
- General AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, AnyX (☆1,842 · Nov 15, 2023 · Updated 2 years ago)
- [ECCVW 2024 Oral] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors". (☆12 · Oct 11, 2024 · Updated last year)
- [NeurIPS 2023] This repository includes the official implementation of the paper "An Inverse Scaling Law for CLIP Training" (☆320 · Jun 3, 2024 · Updated last year)
- Latest Advances on Multimodal Large Language Models (☆17,337 · Feb 7, 2026 · Updated last week)
- Emu Series: Generative Multimodal Models from BAAI (☆1,765 · Jan 12, 2026 · Updated last month)
- Collection of AWESOME vision-language models for vision tasks (☆3,081 · Oct 14, 2025 · Updated 4 months ago)
- [NeurIPS 2023] 3D-OWIS is capable of detecting unknown instances at inference and progressively learning novel classes in the process of … (☆68 · Dec 3, 2023 · Updated 2 years ago)
- ☆360 · Jan 27, 2024 · Updated 2 years ago
- A comprehensive paper list on Vision Transformers and attention, including papers, code, and related websites (☆5,011 · Jul 30, 2024 · Updated last year)
- Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22) (☆2,173 · May 20, 2024 · Updated last year)
- [CVPR 2024] Official implementation of the paper "Visual In-context Learning" (☆529 · Apr 8, 2024 · Updated last year)