innovator-zero / SAK
[ICLR2025] Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
☆14 Updated 9 months ago
Alternatives and similar repositories for SAK
Users that are interested in SAK are comparing it to the libraries listed below
- Official PyTorch implementation of "Extract Free Dense Labels from CLIP" (ECCV 22 Oral) ☆471 Updated 3 years ago
- ☆88 Updated 5 months ago
- 😎 up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources. ☆260 Updated 2 weeks ago
- [NeurIPS 2023] Self-supervised Object-Centric Learning for Videos ☆32 Updated last year
- [NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models" ☆309 Updated last year
- Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation ☆144 Updated 8 months ago
- Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers ☆15 Updated last year
- Code for Scaling Language-Free Visual Representation Learning (WebSSL) ☆245 Updated 9 months ago
- ☆35 Updated last month
- [NeurIPS 2025] Panoptic Captioning: An Equivalence Bridge for Image and Text ☆33 Updated last month
- ☆63 Updated 2 years ago
- [ICLR'23 Oral] Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching ☆255 Updated 2 years ago
- [T-PAMI 2024] & [CVPR 2023] Vote2Cap-DETR; A set-to-set perspective towards 3D Dense Captioning; State-of-the-Art 3D Dense Captioning met… ☆104 Updated last year
- [ICLR 2023] SQA3D for embodied scene understanding and reasoning ☆154 Updated 2 years ago
- ☆64 Updated last year
- [ECCV 2024 (Oral)] Towards Scene Graph Anticipation ☆18 Updated last year
- A curated list of awesome self-supervised learning methods in videos ☆165 Updated last month
- [ICML 2023] MTPD: Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation ☆15 Updated 2 years ago
- [NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding ☆144 Updated last month
- [ECCV 2024 Best Paper Candidate] Implementation of "Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Vi… ☆94 Updated 6 months ago
- An Examination of the Compositionality of Large Generative Vision-Language Models ☆19 Updated last year
- Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number] ☆63 Updated last year
- [ICLR 2023 - UNOFFICIAL] Bridging the Gap to Real-World Object-Centric Learning ☆23 Updated last year
- ☆46 Updated last year
- Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future ☆215 Updated 9 months ago
- Code for the ECCV22 paper "Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds" ☆94 Updated 2 years ago
- ☆53 Updated last year
- ☆44 Updated 2 years ago
- ☆12 Updated last year
- [ECCV2024 Oral🔥] Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface" ☆358 Updated last year