adobe-research / ImageFolder
☆16Updated last month
Alternatives and similar repositories for ImageFolder:
Users who are interested in ImageFolder are comparing it to the libraries listed below.
- CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient☆75Updated last week
- Official repository of paper "Subobject-level Image Tokenization"☆64Updated 9 months ago
- Codebase for the paper "Elucidating the design space of language models for image generation"☆45Updated 2 months ago
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆54Updated last year
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea…☆97Updated 8 months ago
- Official code for paper: Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language☆22Updated 2 weeks ago
- 🔥stable, simple, state-of-the-art VQVAE toolkit & cookbook☆74Updated 7 months ago
- Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models☆61Updated 8 months ago
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective☆59Updated 3 months ago
- Liquid: Language Models are Scalable Multi-modal Generators☆61Updated last month
- ☆133Updated last month
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models☆61Updated 5 months ago
- [NeurIPS 2024] Efficient Multi-modal Models via Stage-wise Visual Context Compression☆51Updated 5 months ago
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.☆90Updated 10 months ago
- XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation☆182Updated last week
- “FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching” FlowAR employs a simplest scale design and is compatible with an…☆84Updated last month
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression"☆81Updated 3 months ago
- Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025).☆47Updated last week
- Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".☆57Updated last year
- Open implementation of "RandAR"☆51Updated 2 weeks ago
- Codes accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment"☆23Updated 2 months ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…☆35Updated 7 months ago
- This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality"☆45Updated 2 weeks ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆95Updated last month
- Adapting LLaMA Decoder to Vision Transformer☆26Updated 8 months ago
- ☆98Updated this week
- Official Implementation of the CrossMAE paper: Rethinking Patch Dependence for Masked Autoencoders☆99Updated last month
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆31Updated 10 months ago
- [ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark☆66Updated last week
- Implementation of the paper "MaskBit: Embedding-free Image Generation from Bit Tokens"☆41Updated last month