Official implementation of Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (NeurIPS 2025)
☆45Nov 24, 2025Updated 3 months ago
Alternatives and similar repositories for Bifrost-1
Users who are interested in Bifrost-1 are comparing it to the repositories listed below
- Analyse and Design Deep Neural Network, Dr.Kalhor, University of Tehran☆11Feb 18, 2024Updated 2 years ago
- [ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners"☆13Dec 1, 2024Updated last year
- Code release for "Gaze-Assisted Medical Image Segmentation" [AIM-FM @ NeurIPS, 2024]☆14Oct 22, 2024Updated last year
- Deep Generative Models, University of Tehran, Dr.Tavassolipour☆17Feb 5, 2024Updated 2 years ago
- Transactions on Multimedia (TMM25)☆19Apr 8, 2025Updated 11 months ago
- Official implementation of the paper "Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability"☆48Dec 25, 2025Updated 2 months ago
- Official PyTorch implementation of our paper "Spherical Vision Transformer for 360° Video Saliency Prediction" (BMVC 2023)☆22Mar 27, 2024Updated last year
- ☆22Jun 17, 2025Updated 8 months ago
- TPDiff: Temporal Pyramid Video Diffusion Model☆25Mar 13, 2025Updated 11 months ago
- Official implementation of "Where and How to Perturb: On the Design of Perturbation Guidance in Diffusion and Flow Models"☆35Nov 30, 2025Updated 3 months ago
- Official Implementation for Generative Neural Fields by Mixtures of Neural Implicit Functions☆19Mar 10, 2024Updated last year
- Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models (ICLR 2026)☆42Updated this week
- UICrit is a dataset containing human-generated natural language design critiques, corresponding bounding boxes for each critique, and des…☆26Nov 19, 2024Updated last year
- MICCAI 2024: Tri-Plane Mamba: Efficiently Adapting Segment Anything Model for 3D Medical Images☆26Apr 3, 2025Updated 11 months ago
- [ICCV 2025 Oral] CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation☆61Aug 1, 2025Updated 7 months ago
- Locally Hierarchical Auto-Regressive Modeling for Image Generation (HQ-Transformer)☆28Feb 14, 2024Updated 2 years ago
- [ICCV2025] VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation☆33Aug 18, 2025Updated 6 months ago
- Turbo3D: Ultra-fast Text-to-3D Generation☆76Dec 7, 2024Updated last year
- Official Implementation of LaViDa: A Large Diffusion Language Model for Multimodal Understanding☆197Dec 17, 2025Updated 2 months ago
- Tutorial on using Hugging Face's Vision Transformers for Image Classification☆10Sep 4, 2021Updated 4 years ago
- ☆18Sep 23, 2025Updated 5 months ago
- Official code for the paper: Can3Tok (ICCV2025)☆39Aug 23, 2025Updated 6 months ago
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning☆234Jan 22, 2026Updated last month
- ☆32Feb 29, 2024Updated 2 years ago
- [NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding☆513Nov 14, 2025Updated 3 months ago
- Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning☆241Feb 10, 2026Updated 3 weeks ago
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?☆35Apr 27, 2023Updated 2 years ago
- [NeurIPS 2024] Repo for the "Visualization-of-Thought" dataset, construction code, and evaluation.☆36Oct 23, 2024Updated last year
- [ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models☆39Jun 14, 2025Updated 8 months ago
- Official implementation of USR (NeurIPS 2024)☆39Dec 21, 2024Updated last year
- UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture☆97Feb 5, 2026Updated last month
- [CVPR 2026] A training-free, mask-free framework for 3D shape editing.☆25Dec 12, 2025Updated 2 months ago
- A Strong Class-Agnostic Tracker for LiDAR Point Clouds☆20Jul 12, 2025Updated 7 months ago
- [AAAI2026] Bring Your Dreams to Life: Continual Text-to-Video Customization☆36Dec 9, 2025Updated 3 months ago
- Tutorial for Graph Neural Network at APBJC 2024.☆10Apr 21, 2025Updated 10 months ago
- [TIP2025] The implementation of "Uncertainty Guided Refinement for Fine-grained Salient Object Detection"☆17Apr 20, 2025Updated 10 months ago
- PyTorch implementation of Self-Refining Video Sampling☆146Feb 6, 2026Updated last month
- Implementation of SmoothCache, a project aimed at speeding-up Diffusion Transformer (DiT) based GenAI models with error-guided caching.☆48Jul 17, 2025Updated 7 months ago
- [CVPR 2026] Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning"☆84Feb 13, 2026Updated 3 weeks ago