DmitryRyumin / ICML-2025-Papers
ICML 2025 Papers: Dive into cutting-edge research from the premier machine learning conference. Stay current with breakthroughs in deep learning, generative AI, optimization, reinforcement learning, and beyond. Code implementations included. ⭐ support the future of machine learning research!
☆30Updated 3 months ago
Alternatives and similar repositories for ICML-2025-Papers
Users who are interested in ICML-2025-Papers are comparing it to the repositories listed below.
- The author's implementation of FUDOKI, a multimodal large language model purely based on discrete flow matching.☆67Updated last month
- A Collection of Papers on Diffusion Language Models☆152Updated 4 months ago
- [ACM-MM 2025 Workshop] More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment.☆25Updated 2 months ago
- This is the official implementation of 2024 CVPR paper "EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models".☆92Updated 2 months ago
- Official implementation of the paper "Revisiting Multimodal Positional Encoding in Vision–Language Models"☆55Updated last month
- [NAACL 2024] LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-text Generation?☆43Updated last year
- ☆58Updated 6 months ago
- This is for ACL 2025 Findings Paper: From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities☆87Updated 3 weeks ago
- This is a collection of recent papers on reasoning in video generation models.☆95Updated 3 weeks ago
- Official PyTorch implementation of EMOVA in CVPR 2025 (https://arxiv.org/abs/2409.18042)☆76Updated 10 months ago
- [CVPR'25] MergeVQ: A Unified Framework for Visual Generation and Representation with Token Merging and Quantization☆47Updated 6 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation☆235Updated 5 months ago
- (NIPS 2025) OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Align…☆123Updated 2 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.☆347Updated 3 weeks ago
- [ICLR 2026] This is an early exploration to introduce Interleaving Reasoning to Text-to-image Generation field and achieve the SoTA bench…☆85Updated this week
- ✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints☆80Updated 6 months ago
- ☆137Updated last week
- ☆57Updated 2 weeks ago
- The official code of "Weak-to-Strong Diffusion with Reflection".☆55Updated 8 months ago
- [CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners☆155Updated last year
- Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning"☆79Updated last month
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation☆179Updated 2 months ago
- ☆37Updated last week
- ☆31Updated 5 months ago
- MotionSight's official code implementation.☆44Updated 4 months ago
- UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation☆120Updated 3 weeks ago
- Paper list, tutorial, and nano code snippets for Diffusion Large Language Models.☆152Updated last week
- Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning☆212Updated this week
- [ACMMM 2025 - Dataset Track] ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies☆22Updated 7 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠☆120Updated 8 months ago