NVlabs / PS3
Scaling Vision Pre-Training to 4K Resolution
☆60 Updated this week
Alternatives and similar repositories for PS3:
Users who are interested in PS3 are comparing it to the repositories listed below.
- ☆68 Updated 4 months ago
- Autoregressive Image Generation with Randomized Parallel Decoding ☆35 Updated this week
- ☆47 Updated last month
- [CVPR2025] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project/ ☆129 Updated last week
- [NeurIPS 2024] Official PyTorch Implementation of "FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner" ☆66 Updated 5 months ago
- [CVPR2025] PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Noise-Free Framework for Cross-Mo… ☆153 Updated 2 weeks ago
- [ECCV 2024] Code for "EraseDraw: Learning to Insert Objects by Erasing Them from Images" ☆24 Updated 3 months ago
- Navigate dreamscapes with a click – your chosen point guides the drone’s flight in a thrilling visual journey. ☆45 Updated last year
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆63 Updated 3 weeks ago
- Official PyTorch implementation - Video Motion Transfer with Diffusion Transformers ☆40 Updated this week
- [NeurIPS 2024 D&B Track] Implementation for "FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models" ☆67 Updated 3 months ago
- Distilling Diversity and Control in Diffusion Models ☆33 Updated this week
- The official implementation of the paper "ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations". ☆35 Updated 2 months ago
- Vico: Compositional Video Generation as Flow Equalization ☆58 Updated 4 months ago
- [CVPR 2024] Official PyTorch implementation of "ECLIPSE: Revisiting the Text-to-Image Prior for Efficient Image Generation" ☆62 Updated 10 months ago
- Code for the paper "Benchmarking Object Detectors with COCO: A New Path Forward." ☆26 Updated 8 months ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆65 Updated last month
- Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers" ☆54 Updated last week
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆69 Updated 3 months ago
- ☆191 Updated last month
- Official implementation of "Make It Count: Text-to-Image Generation with an Accurate Number of Objects" (CVPR 2025) ☆69 Updated 2 weeks ago
- Diffusion Models as Data Mining Tools ☆53 Updated 3 weeks ago
- Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth? ☆111 Updated last month
- Implementation of MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path ☆68 Updated last year
- ☆27 Updated 3 weeks ago
- Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening ☆52 Updated last month
- Code for "TVG: A Training-free Transition Video Generation Method with Diffusion Models" ☆41 Updated 7 months ago
- Official implementation of "Reangle-A-Video: 4D Video Generation as Video-to-Video Translation" ☆33 Updated 2 weeks ago
- ☆65 Updated last year
- Implementation of Zero-Shot Video Semantic Segmentation [CVPR 2025] ☆44 Updated last month