KKallidromitis / SA-1B-Downloader
Simple script to parallelize downloading and extracting files for the SA-1B dataset.
☆30 Updated last month
Related projects
Alternatives and complementary repositories for SA-1B-Downloader
- This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model ☆90 Updated 4 months ago
- ☆57 Updated last year
- ☆19 Updated 11 months ago
- ☆52 Updated last year
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆26 Updated 5 months ago
- Official Implementation of ICCV 2023 Paper - SegPrompt: Boosting Open-World Segmentation via Category-level Prompt Learning ☆110 Updated 3 months ago
- VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation ☆84 Updated 2 months ago
- 🔥 Aurora Series: A more efficient multimodal large language model series for video. ☆47 Updated last week
- OpenMMLab Detection Toolbox and Benchmark for V3Det ☆15 Updated 7 months ago
- ☆101 Updated 5 months ago
- DiverGen (CVPR 2024) & BSGAL (ICML 2024) ☆36 Updated 2 weeks ago
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings" ☆112 Updated 3 months ago
- IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model ☆25 Updated last month
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception ☆120 Updated last month
- [NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training ☆135 Updated 2 weeks ago
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆51 Updated 3 months ago
- ☆109 Updated 5 months ago
- [NeurIPS 2024] Efficient Multi-modal Models via Stage-wise Visual Context Compression ☆41 Updated 3 months ago
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones". ☆23 Updated 10 months ago
- [CVPR 2024 Highlight] ImageNet-D ☆38 Updated last month
- (ICLR 2024, CVPR 2024) SparseFormer ☆63 Updated 2 weeks ago
- ☆58 Updated last year
- The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction". ☆45 Updated 3 weeks ago
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea… ☆98 Updated 6 months ago
- [CVPR24] Official Implementation of GEM (Grounding Everything Module) ☆86 Updated last month
- ☆29 Updated 7 months ago
- 【ECCV2024】The official repo of Griffon series ☆106 Updated 2 weeks ago
- ☆18 Updated last year
- ☆21 Updated last year