zhengdian1/AIA

☆45

Alternatives and similar repositories for AIA

Users that are interested in AIA are comparing it to the libraries listed below.

Vchitect / Uni-MMMU
[ACL2026 oral] Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark
☆26 • Apr 13, 2026 • Updated 3 months ago
iSEE-Laboratory / DIF-of-Bimanual-Robotic-Manipulation
(ICCV 2025) Official repository of paper "Rethinking Bimanual Robotic Manipulation: Learning with Decoupled Interaction Framework"
☆15 • Oct 15, 2025 • Updated 9 months ago
iSEE-Laboratory / PanoDecouple
(CVPR2025 Highlight) Official repository of paper "Panorama Generation From NFoV Image Done Right"
☆19 • May 29, 2025 • Updated last year
Visual-AI / Pancap
[NeurIPS 2025] Panoptic Captioning: An Equivalence Bridge for Image and Text
☆38 • Jan 31, 2026 • Updated 5 months ago
wendell0218 / Janus-Pro-R1
[NeurIPS 2025] Official repository of the paper "Unlocking Aha Moments via Reinforcement Learning: Advancing Collaborative Visual Compreh…
☆23 • Sep 27, 2025 • Updated 10 months ago
LAW1223 / OpenSubject
☆55 • Dec 10, 2025 • Updated 7 months ago
iSEE-Laboratory / CycleManip
[CVPR2026] Official repository of paper "CycleManip: Enabling Cyclic Task Manipulation via Effective Historical Perception and Understand…
☆25 • Feb 21, 2026 • Updated 5 months ago
HorizonWind2004 / reconstruction-alignment
[ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potenti…
☆411 • May 23, 2026 • Updated 2 months ago
appletea233 / EditThinker
Unlocking Iterative Reasoning for Any Image Editor
☆112 • Jan 18, 2026 • Updated 6 months ago
Osilly / Interleaving-Reasoning-Generation
[ICLR 2026] This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA bench…
☆100 • Jan 26, 2026 • Updated 6 months ago
HumanMLLM / IRG-MotionLLM
(ECCV2026) Official repository of paper "IRG-MotionLLM: Interleaving Motion Generation, Assessment and Refinement for Text-to-Motion Gene…
☆30 • Jul 1, 2026 • Updated 3 weeks ago
AIFrontierLab / UniGame
[CVPR'26] UniGame code implementation
☆20 • Apr 21, 2026 • Updated 3 months ago
Jacky-hate / HiAR
[ECCV 2026] sink-free, anti-drift causal AR Video Generation
☆153 • Updated this week
showlab / UniRL
The code repository of UniRL
☆53 • May 30, 2025 • Updated last year
WayneJin0918 / SRUM
[ECCV 2026🔥] SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models
☆93 • Nov 26, 2025 • Updated 8 months ago
NOVAglow646 / Monet
[CVPR 2026] Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language"
☆215 • Mar 19, 2026 • Updated 4 months ago
LAW1223 / AlignVid
☆24 • May 29, 2026 • Updated 2 months ago
iSEE-Laboratory / EgoExo-Fitness
(ECCV 2024) Official repository of paper "EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding"
☆38 • Apr 8, 2025 • Updated last year
PKU-YuanGroup / UniSandBox
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward
☆60 • Nov 27, 2025 • Updated 8 months ago
Cominclip / OmniVerifier
[ICLR 2026 Oral & ICML 2026] Generative Universal Verifier as Multimodal Meta-Reasoner
☆64 • May 29, 2026 • Updated 2 months ago
zifuwanggg / Jigsaw-R1
[TMLR 2025] Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles
☆15 • Oct 17, 2025 • Updated 9 months ago
arctanxarc / GENIUS
☆43 • May 9, 2026 • Updated 2 months ago
onecat-ai / OneCAT
OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation
☆261 • Sep 22, 2025 • Updated 10 months ago
ympan0508 / aeslides
A reinforcement learning framework with verifiable aesthetic rewards for improving aesthetic slide generation capabilities in LLM agents.…
☆30 • May 19, 2026 • Updated 2 months ago
haowei-freesky / HERMES
Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" [ACL 2026]
☆93 • May 8, 2026 • Updated 2 months ago
nnnth / UniLIP
[ICLR 2026 🔥] Official implementation of "UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing"
☆151 • Jan 26, 2026 • Updated 6 months ago
Gen-Verse / HermesFlow
[NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
☆77 • Sep 19, 2025 • Updated 10 months ago
GeekGuru123 / ProfilingDiT
☆20 • Jan 1, 2026 • Updated 6 months ago
iSEE-Laboratory / ProEdit
Official repository of paper "ProEdit: Inversion-based Editing From Prompts Done Right"
☆116 • Feb 5, 2026 • Updated 5 months ago
jacklishufan / Reflect-DiT
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
☆56 • Aug 16, 2025 • Updated 11 months ago
daeunni / Video-Skill-CoT
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning" [EMNLP 2025 Findings]
☆18 • Aug 27, 2025 • Updated 11 months ago
Hungryyan1 / UniCorn
☆80 • Apr 12, 2026 • Updated 3 months ago
wren93 / tuna
☆94 • Apr 29, 2026 • Updated 2 months ago
PeiwenSun2000 / SpaceVista
The official repo for SpaceVista: All-Scale Visual Spatial Reasoning from mm to km.
☆43 • May 26, 2026 • Updated 2 months ago
iSEE-Laboratory / DiffuVolume
(IJCV2025) The official implementation of "DiffuVolume: Diffusion Model for Volume based Stereo Matching"
☆30 • Jan 15, 2025 • Updated last year
Run542968 / Awesome-3D-Human-Motion-Generation
☆25 • Jul 24, 2024 • Updated 2 years ago
baaivision / Emu3.5
Native Multimodal Models are World Learners
☆1,538 • Dec 30, 2025 • Updated 6 months ago
WeichenFan / UAE
Official repo for UAE
☆209 • Jun 21, 2026 • Updated last month
G-U-N / UniRL
[ICML 2026] a unified reinforcement learning toolbox for joint RL on language models and diffusion models
☆91 • May 26, 2026 • Updated 2 months ago