[ICLR 2025] Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
☆14 · Jul 4, 2025 · Updated 8 months ago
Alternatives and similar repositories for ADDP
Users who are interested in ADDP are comparing it to the repositories listed below.
- MR. Video: MapReduce is the Principle for Long Video Understanding ☆30 · Apr 23, 2025 · Updated 10 months ago
- Exposing Text-Image Inconsistency Using Diffusion Models (ICLR 2024) ☆10 · Jun 15, 2024 · Updated last year
- [CVPR2025] The code for "Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction." ☆21 · Oct 19, 2025 · Updated 4 months ago
- ☆13 · Jul 10, 2024 · Updated last year
- ☆15 · Mar 30, 2025 · Updated 11 months ago
- [IROS 2023] "Streaming Motion Forecasting for Autonomous Driving" ☆41 · Oct 2, 2023 · Updated 2 years ago
- (TPAMI'2024) ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation ☆22 · Aug 8, 2024 · Updated last year
- ☆22 · Aug 3, 2024 · Updated last year
- (CVPR 24) HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction ☆22 · Jun 4, 2024 · Updated last year
- [ECCV-24] This is the official implementation of the paper "SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation". ☆27 · Oct 13, 2024 · Updated last year
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation ☆23 · Jul 30, 2025 · Updated 7 months ago
- ☆27 · Mar 3, 2025 · Updated last year
- [CVPR 2025] GPS as a Control Signal for Image Generation ☆25 · Mar 18, 2025 · Updated 11 months ago
- [WACV 2024] This is the official implementation of BEVMap, a map-aware BEV modeling framework for multiview-camera detection ☆22 · Dec 28, 2023 · Updated 2 years ago
- Official repository of the CVPR 2024 paper "RMem: Restricted Memory Banks Improve Video Object Segmentation" ☆53 · Jan 31, 2025 · Updated last year
- StreamPETR with 3dppe Extension ☆51 · Jan 16, 2024 · Updated 2 years ago
- Video Diffusion Transformers are In-Context Learners ☆35 · Jan 6, 2025 · Updated last year
- [WACV 2025] PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction ☆24 · Oct 15, 2024 · Updated last year
- Code for Paper 'Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach' ☆35 · Jan 2, 2026 · Updated 2 months ago
- Reward Guided Latent Consistency Distillation ☆26 · Oct 9, 2024 · Updated last year
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ☆207 · Jul 14, 2025 · Updated 7 months ago
- [ECCV 2024] RecurrentBEV: A Long-term Temporal Fusion Framework for Multi-view 3D Detection ☆32 · Sep 28, 2024 · Updated last year
- T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation (ICCV'25) ☆44 · Oct 6, 2025 · Updated 4 months ago
- TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models ☆37 · Nov 10, 2024 · Updated last year
- [ACL 2025] The official PyTorch implementation of "MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection". ☆26 · May 26, 2025 · Updated 9 months ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆35 · Mar 12, 2024 · Updated last year
- [NeurIPS 2025] VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning ☆64 · Jan 6, 2026 · Updated last month
- The official implementation of the ECCV 2024 paper: Continuity Preserving Online CenterLine Graph Learning ☆34 · Dec 16, 2024 · Updated last year
- ☆12 · Apr 1, 2025 · Updated 11 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 · Jul 19, 2024 · Updated last year
- Is Your HD Map Constructor Reliable under Sensor Corruptions? ☆37 · Aug 18, 2024 · Updated last year
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training" ☆39 · Mar 4, 2024 · Updated 2 years ago
- Implementation of PF-Track ☆250 · Jul 28, 2023 · Updated 2 years ago
- ☆43 · May 30, 2025 · Updated 9 months ago
- Codebase for the paper "Elucidating the design space of language models for image generation" ☆46 · Nov 17, 2024 · Updated last year
- A framework for steering MoE models by detecting and controlling behavior-linked experts. ☆29 · Sep 12, 2025 · Updated 5 months ago
- Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer ☆16 · Sep 7, 2024 · Updated last year
- [ICLR2026] The code for "Interp3D: Correspondence-Aware Interpolation for Generative Textured 3D Morphing." ☆24 · Jan 21, 2026 · Updated last month
- ☆10 · Apr 7, 2025 · Updated 10 months ago