facebookresearch / PUG
This is the repository for the Photorealistic Unreal Graphics (PUG) datasets for representation learning.
☆233 · Updated 11 months ago
Alternatives and similar repositories for PUG:
Users who are interested in PUG compare it to the repositories listed below.
- ☆200 · Updated last year
- Internet Explorer explores the web in a self-supervised manner to progressively find relevant examples that improve performance on a desi… ☆163 · Updated 2 years ago
- [ICCV2023] VLPart: Going Denser with Open-Vocabulary Part Segmentation ☆371 · Updated last year
- ☆183 · Updated last year
- The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions" ☆242 · Updated 2 months ago
- PIPs++ ☆304 · Updated 8 months ago
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training" ☆312 · Updated 9 months ago
- LLaVA-Interactive-Demo ☆367 · Updated 8 months ago
- This repo contains documentation and code needed to use PACO dataset: data loaders and training and evaluation scripts for objects, parts… ☆276 · Updated last year
- Python Library to evaluate VLM models' robustness across diverse benchmarks ☆195 · Updated last week
- Aim for the moon. If you miss, you may hit a star. ☆163 · Updated 2 years ago
- [ICCV2023] Segment Every Reference Object in Spatial and Temporal Spaces ☆237 · Updated last month
- Combining Segment Anything (SAM) with Grounded DINO for zero-shot object detection and CLIPSeg for zero-shot segmentation ☆402 · Updated 10 months ago
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings" ☆122 · Updated 7 months ago
- Grounded Segment Anything: From Objects to Parts ☆403 · Updated last year
- Implementation of I-JEPA from "Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture" ☆265 · Updated 2 months ago
- ☆167 · Updated 5 months ago
- Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗 ☆223 · Updated last month
- Implementation of Lumiere, SOTA text-to-video generation from Google Deepmind, in Pytorch ☆269 · Updated 7 months ago
- Data release for the ImageInWords (IIW) paper. ☆209 · Updated 4 months ago
- [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models ☆275 · Updated 11 months ago
- Let's make a video clip ☆93 · Updated 2 years ago
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models". ☆102 · Updated 9 months ago
- Huggingface-compatible SDXL Unet implementation that is readily hackable ☆413 · Updated last year
- Code release for "Dropout Reduces Underfitting" ☆312 · Updated last year
- ☆131 · Updated 2 years ago
- Object Recognition as Next Token Prediction (CVPR 2024 Highlight) ☆174 · Updated 3 months ago
- Open reproduction of MUSE for fast text2image generation. ☆347 · Updated 9 months ago
- Code release for "Improved baselines for vision-language pre-training" ☆60 · Updated 10 months ago
- Simple large-scale training of stable diffusion with multi-node support. ☆129 · Updated last year