JiuhaiChen / BLIP3o
☆1,231 Updated this week
Alternatives and similar repositories for BLIP3o
Users interested in BLIP3o are comparing it to the repositories listed below
- MMaDA - Open-Sourced Multimodal Large Diffusion Language Models ☆1,136 Updated 2 weeks ago
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat… ☆1,265 Updated last week
- An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL ☆783 Updated last week
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆601 Updated 2 months ago
- [CVPR 2025 Oral] Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis ☆1,341 Updated this week
- 📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models. ☆588 Updated this week
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ☆603 Updated 8 months ago
- UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation ☆566 Updated this week
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆897 Updated 4 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆535 Updated last month
- A Unified Tokenizer for Visual Generation and Understanding ☆340 Updated last month
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ☆1,780 Updated 10 months ago
- Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video] ☆577 Updated last month
- [CVPR2024 Highlight] VBench - We Evaluate Video Generation ☆1,053 Updated 2 weeks ago
- Implementation for Describe Anything: Detailed Localized Image and Video Captioning ☆1,179 Updated last month
- [ICLR 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation. ☆1,530 Updated this week
- Official implementation of UnifiedReward & UnifiedReward-Think ☆429 Updated last week
- ☆514 Updated 7 months ago
- Scalable and memory-optimized training of diffusion models ☆1,195 Updated 3 weeks ago
- Next-Token Prediction is All You Need ☆2,152 Updated 3 months ago
- This repo contains the code for 1D tokenizer and generator ☆918 Updated 3 months ago
- Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning' ☆2,001 Updated last month
- Official implementation of OneDiffusion paper (CVPR 2025) ☆638 Updated 6 months ago
- Multimodal Models in Real World ☆517 Updated 4 months ago
- Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ☆374 Updated last week
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models ☆703 Updated 2 months ago
- ☆219 Updated last month
- VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning ☆251 Updated 2 months ago
- ☆339 Updated this week
- (CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models ☆651 Updated last month