UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
☆46Aug 26, 2025Updated 6 months ago
Alternatives and similar repositories for UniFork
Users who are interested in UniFork are comparing it to the repositories listed below
- [NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations☆201Sep 18, 2025Updated 5 months ago
- Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer☆16Sep 7, 2024Updated last year
- Official implementation of Forge4D: Feed-Forward 4D Human Reconstruction and Interpolation from Uncalibrated Sparse Videos☆40Sep 30, 2025Updated 5 months ago
- CVPR 2025 Accepted Papers☆23Dec 20, 2025Updated 2 months ago
- Consistent Autoregressive Video Generation with Long Context☆67Feb 6, 2026Updated 3 weeks ago
- ☆15Sep 22, 2025Updated 5 months ago
- [NeurIPS 2025] The official PyTorch implementation of the "Vision Function Layer in MLLM".☆28Dec 18, 2025Updated 2 months ago
- [AAAI 2026] Zero-to-Hero: Zero-Shot Initialization Empowering Reference-Based Video Appearance Editing☆25Nov 20, 2025Updated 3 months ago
- Code implementation for: From Virtual Games to Real-World Play☆46Jun 23, 2025Updated 8 months ago
- Code for paper: Freeplane: Unlocking Free Lunch in Triplane-Based Sparse-View Reconstruction Models☆18Jun 6, 2024Updated last year
- [ICCV 2025] TokensGen: Harnessing Condensed Tokens for Long Video Generation☆56Dec 10, 2025Updated 2 months ago
- Collection of peptide de novo sequencing algorithms by BEAM labs☆30Dec 6, 2025Updated 2 months ago
- the official repo for "D-AR: Diffusion via Autoregressive Models"☆135Jan 29, 2026Updated last month
- [ICCV 2025] Official Implementation of RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring …☆18Jun 27, 2025Updated 8 months ago
- ☆51Nov 8, 2025Updated 3 months ago
- [ICCV 2025] Official Implementation of "Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation". Junyu Xie, Tengda H…☆21Jul 26, 2025Updated 7 months ago
- Official PyTorch Implementation of Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model [ECCV'24]☆22Dec 24, 2024Updated last year
- Official Implementation for "Editing Massive Concepts in Text-to-Image Diffusion Models"☆19Mar 21, 2024Updated last year
- ☆17Feb 20, 2025Updated last year
- ☆132Jun 24, 2025Updated 8 months ago
- Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning☆237May 30, 2025Updated 9 months ago
- DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection☆21Oct 5, 2023Updated 2 years ago
- [CVPR 2025 Highlight] MeshGen: Generating PBR Textured Mesh with Render-Enhanced Auto-Encoder and Generative Data Augmentation☆64May 9, 2025Updated 9 months ago
- [NeurIPS 2024] DEMO: Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning☆47Nov 1, 2024Updated last year
- Official implementation for "pOps: Photo-Inspired Diffusion Operators"☆85Jul 23, 2024Updated last year
- [ICCV 25] SpectralAR: Spectral Autoregressive Visual Generation☆35Jun 13, 2025Updated 8 months ago
- ☆43May 10, 2025Updated 9 months ago
- This repository contains the code for the paper - "Aligning Text, Images, and 3D Structure Token-by-Token" (CVPR 2026)☆44Jun 11, 2025Updated 8 months ago
- ☆27Mar 3, 2025Updated last year
- ☆24Dec 23, 2024Updated last year
- ☆23Mar 15, 2024Updated last year
- MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision☆27May 26, 2025Updated 9 months ago
- Awesome Unified Multimodal Models☆1,125Feb 6, 2026Updated 3 weeks ago
- [ICCV 2025] The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation"☆59Apr 5, 2025Updated 10 months ago
- [ICML'25] The PyTorch implementation of paper: "AdaWorld: Learning Adaptable World Models with Latent Actions".☆206Jun 17, 2025Updated 8 months ago
- MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction☆142Apr 27, 2024Updated last year
- [CVPR 2025] GPS as a Control Signal for Image Generation☆25Mar 18, 2025Updated 11 months ago
- [ICCV 2025] Official implementation of "Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing"☆28Apr 15, 2025Updated 10 months ago
- Awesome autoregressive vision foundation models☆26Dec 24, 2024Updated last year