[CVPR25] IAR
☆17Jun 13, 2025Updated 8 months ago
Alternatives and similar repositories for IAR
Users who are interested in IAR are comparing it to the repositories listed below
- ☆39Jul 27, 2024Updated last year
- [ACM MM24] MotionMaster: Training-free Camera Motion Transfer For Video Generation☆99Oct 15, 2024Updated last year
- [ACL 2025] The official PyTorch implementation of "MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection".☆26May 26, 2025Updated 9 months ago
- ☆34Dec 29, 2025Updated 2 months ago
- This is a python library. Install with "python3 -m pip install rp" then run with "python3 -m rp" or just "rp". Requires python≥3.5☆13Feb 16, 2026Updated 2 weeks ago
- A framework for steering MoE models by detecting and controlling behavior-linked experts.☆29Sep 12, 2025Updated 5 months ago
- ☆22Dec 11, 2025Updated 2 months ago
- [ICCV2025] The official code of "DreamRelation: Relation-Centric Video Customization"☆27Feb 4, 2026Updated 3 weeks ago
- Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO☆92Dec 1, 2025Updated 3 months ago
- ☆52Jan 15, 2026Updated last month
- ☆47Apr 20, 2025Updated 10 months ago
- Exposing Text-Image Inconsistency Using Diffusion Models (ICLR 2024)☆10Jun 15, 2024Updated last year
- Official implementation of the paper "M3CoTBench: Benchmark Chain-of-Thought of MLLMs in Medical Image Understanding"☆21Jan 14, 2026Updated last month
- [MM 2023] Stroke-based Neural Painting and Stylization with Dynamically Predicted Painting Region☆48Oct 16, 2023Updated 2 years ago
- A toy text-to-image model trained from scratch.☆19Jun 9, 2025Updated 8 months ago
- [ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"☆38Oct 9, 2025Updated 4 months ago
- [npj Digital Medicine] A multimodal multidomain multilingual medical foundation model for zero shot clinical diagnosis☆17Feb 6, 2025Updated last year
- [ICLR 2025] This repo is the official implementation of "The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs".☆13Jan 25, 2025Updated last year
- 2D Gaussian splatting for image compression☆17Nov 29, 2024Updated last year
- [ICLR 2025] Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception☆14Jul 4, 2025Updated 7 months ago
- Official implementation of "Towards One-Step Causal Video Generation via Adversarial Self-Distillation" (arXiv 2025). A novel framework f…☆25Nov 4, 2025Updated 3 months ago
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but …☆13Dec 1, 2022Updated 3 years ago
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework"☆19Jan 18, 2026Updated last month
- ☆14May 26, 2025Updated 9 months ago
- Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Finding]"☆15Aug 27, 2025Updated 6 months ago
- ☆36Jan 13, 2026Updated last month
- ☆13Jul 10, 2024Updated last year
- This repository contains the code for CVPRW 2024 paper: Generating Material-Aware 3D Models from Sparse Views☆13Jun 11, 2024Updated last year
- ☆11Oct 2, 2024Updated last year
- ☆10Nov 27, 2024Updated last year
- [ICLR 2026] SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs☆44Oct 14, 2025Updated 4 months ago
- [CVPR '26] SceneTok: A Compressed, Diffusable Token Space for 3D Scenes☆59Feb 24, 2026Updated last week
- ☆11Sep 7, 2020Updated 5 years ago
- Official repository for WWW'24 paper "MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation"☆12Jul 25, 2024Updated last year
- [CVPR 2024] SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis☆63Dec 18, 2024Updated last year
- [T-PAMI 2025] EMOv2: Pushing 5M Vision Model Frontier☆54Dec 30, 2024Updated last year
- [MM 2023] Toward High Quality Facial Representation Learning☆19Oct 30, 2023Updated 2 years ago
- This is a project based on an accepted paper "Weighted Poisson-disk Resampling on Large-Scale Point Clouds"☆16Dec 19, 2024Updated last year
- [CVPR2025] Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models☆19Apr 30, 2025Updated 10 months ago