FoundationVision / GLEE
[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
⭐1,153 · Updated 11 months ago
Alternatives and similar repositories for GLEE
Users interested in GLEE are comparing it to the libraries listed below.
- [ECCV 2024] The official code of the paper "Open-Vocabulary SAM". ⭐1,007 · Updated last month
- 🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos ⭐1,279 · Updated 3 weeks ago
- [CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning" ⭐832 · Updated last month
- OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24] ⭐1,322 · Updated 4 months ago
- Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM" ⭐1,061 · Updated 4 months ago
- Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series ⭐1,036 · Updated 8 months ago
- [CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception ⭐593 · Updated last year
- [ECCV 2024] Tokenize Anything via Prompting ⭐595 · Updated 9 months ago
- [CVPR'23] Universal Instance Perception as Object Discovery and Retrieval ⭐1,277 · Updated 2 years ago
- Project Page for "LISA: Reasoning Segmentation via Large Language Model" ⭐2,425 · Updated 7 months ago
- [CVPR 2024] Official implementation of the paper "Visual In-context Learning" ⭐502 · Updated last year
- [ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization ⭐578 · Updated last year
- [ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection" ⭐732 · Updated last year
- [AAAI 2025] Official PyTorch implementation of "TinySAM: Pushing the Envelope for Efficient Segment Anything Model" ⭐518 · Updated 8 months ago
- Official implementation of OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion ⭐369 · Updated 6 months ago
- Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds ⭐1,619 · Updated last year
- DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding ⭐1,233 · Updated 2 months ago
- (TPAMI 2024) A Survey on Open Vocabulary Learning ⭐952 · Updated 6 months ago
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything ⭐1,344 · Updated 5 months ago
- An efficient modular implementation of Associating Objects with Transformers for Video Object Segmentation in PyTorch ⭐573 · Updated last year
- EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything ⭐2,419 · Updated 9 months ago
- This is the third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection ⭐718 · Updated 2 months ago
- Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model" ⭐473 · Updated 6 months ago
- [ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity" ⭐2,733 · Updated 2 months ago
- VisionLLM Series ⭐1,108 · Updated 7 months ago
- [ICCV 2023] Tracking Anything with Decoupled Video Segmentation ⭐1,432 · Updated 5 months ago
- SAM-PT: Extending SAM to zero-shot video segmentation with point-based tracking. ⭐1,018 · Updated last year
- Official repository for "AM-RADIO: Reduce All Domains Into One" ⭐1,347 · Updated last week
- [ICCV 2025] SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree ⭐515 · Updated 2 months ago
- [ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy ⭐2,580 · Updated last month