sirkosophia / DIP
Official implementation of DIP: Unsupervised Dense In-Context Post-training of Visual Representations
☆45 · Updated last month
Alternatives and similar repositories for DIP
Users who are interested in DIP are comparing it to the repositories listed below.
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆21 · Updated last year
- This is an implementation of the paper "Are We Done with Object-Centric Learning?" ☆11 · Updated last month
- Distributed Optimization Infra for learning CLIP models ☆27 · Updated last year
- [CVPR2025] Official code repository for SeTa: "Scale Efficient Training for Large Datasets" ☆21 · Updated 7 months ago
- [CVPR 2024 Highlight] SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers ☆69 · Updated last year
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ☆53 · Updated 4 months ago
- Test-Time Training on Video Streams ☆64 · Updated 2 years ago
- CycleReward is a reward model trained on cycle consistency preferences to measure image-text alignment. ☆50 · Updated last month
- This repo contains the official implementation of ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long … ☆93 · Updated last year
- ☆26 · Updated 6 months ago
- [ECCV’24] Official repository for "BEAF: Observing Before-AFter Changes to Evaluate Hallucination in Vision-language Models" ☆20 · Updated 7 months ago
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea… ☆98 · Updated last year
- Official Release of NeurIPS 2023 Spotlight paper "Object-Centric Slot Diffusion" ☆70 · Updated last year
- ☆105 · Updated 6 months ago
- MIMIC: Masked Image Modeling with Image Correspondences ☆16 · Updated last year
- ☆33 · Updated 3 years ago
- Implementation of the proposed LVMAE, from the paper, Extending Video Masked Autoencoders to 128 frames, in Pytorch ☆54 · Updated 11 months ago
- Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations ICCV23 ☆29 · Updated 10 months ago
- ☆39 · Updated 5 months ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives" ☆44 · Updated 11 months ago
- ☆37 · Updated 8 months ago
- [CVPR 2024] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities ☆99 · Updated last year
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective ☆72 · Updated last year
- [CVPR 2025] Parallel Sequence Modeling via Generalized Spatial Propagation Network ☆106 · Updated 3 months ago
- PyTorch implementation of "Sample- and Parameter-Efficient Auto-Regressive Image Models" from CVPR 2025 ☆13 · Updated 7 months ago
- Towards training VQ-VAE models robustly! ☆85 · Updated 3 months ago
- FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024 ☆22 · Updated 10 months ago
- CatMAE ☆14 · Updated last year
- ☆10 · Updated last year
- Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496 ☆91 · Updated 6 months ago