[NeurIPS 2025] The official PyTorch implementation of the "Vision Function Layer in MLLM".
☆28 Dec 18, 2025 Updated 2 months ago
Alternatives and similar repositories for Vision-Function-Layer
Users who are interested in Vision-Function-Layer are comparing it to the libraries listed below.
- [ECCV 2024] The official PyTorch implementation of the "Part2Object: Hierarchical Unsupervised 3D Instance Segmentation". ☆25 Sep 12, 2024 Updated last year
- [CVPR2025] Rethinking Query-based Transformer for Continual Image Segmentation ☆41 Jul 16, 2025 Updated 7 months ago
- [ICLR 2025] MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow ☆26 Apr 9, 2025 Updated 10 months ago
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation ☆46 Aug 26, 2025 Updated 6 months ago
- [ECCV 2024] The official PyTorch implementation of the "Plain-Det: A Plain Multi-Dataset Object Detector". ☆30 Dec 8, 2024 Updated last year
- Official repository for ACM Multimedia'23 paper "MATK: The Meme Analytical Tool Kit" ☆13 May 29, 2024 Updated last year
- [TOG 2025] Order Matters: Learning Element Ordering for Graphic Design Generation ☆20 Aug 5, 2025 Updated 6 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 Jul 19, 2024 Updated last year
- Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer ☆16 Sep 7, 2024 Updated last year
- ☆12 Aug 25, 2021 Updated 4 years ago
- Design and implementation of a campus volunteer recruitment platform; Lab 4 of the 2022 "Database Systems" course at Harbin Institute of Technology (Shenzhen) ☆11 Aug 25, 2023 Updated 2 years ago
- 🚀 Sliding Window Attention Training for Efficient Large Language Models ☆16 Dec 8, 2025 Updated 2 months ago
- [ICCV 2023] Code for "Multi-task View Synthesis with Neural Radiance Fields" ☆11 Oct 2, 2023 Updated 2 years ago
- Exposing Text-Image Inconsistency Using Diffusion Models (ICLR 2024) ☆10 Jun 15, 2024 Updated last year
- ☆18 May 15, 2025 Updated 9 months ago
- A PyTorch implementation of the paper "MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis". ☆12 Jan 16, 2023 Updated 3 years ago
- ☆12 Apr 26, 2022 Updated 3 years ago
- ☆14 May 20, 2025 Updated 9 months ago
- [AAAI 2025] Official pytorch implementation of "Diffusion Model Patching via Mixture-of-Prompts" ☆13 Dec 12, 2024 Updated last year
- [CVPR 2025] GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration ☆20 Mar 21, 2025 Updated 11 months ago
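- Native AI is a multi-agent system exploring the local-services e-commerce domain; its AI assistant provides one-stop handling of users' everyday needs such as dining, entertainment, lodging, and travel. The system is built on large language model technology, mainly to explore Multi Agent applications. ☆12 Apr 13, 2025 Updated 10 months ago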
- ☆11 Nov 30, 2025 Updated 3 months ago
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts". ☆12 Oct 25, 2023 Updated 2 years ago
- [ICLR 2025] Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception ☆14 Jul 4, 2025 Updated 7 months ago
- ☆10 Dec 12, 2023 Updated 2 years ago
- [NeurIPS '25] FastDINOv2: Frequency Based Curriculum Learning Improves Robustness and Training Speed ☆25 Jul 26, 2025 Updated 7 months ago
- ☆17 Mar 19, 2022 Updated 3 years ago
- Dataflow-MM, multi-media operators for Dataflow. We aim to prepare data for Multimodal Large Language Models. ☆30 Feb 15, 2026 Updated 2 weeks ago
- [NeurIPS 2022] Explaining Graph Neural Networks with Structure-Aware Cooperative Games (GStarX) ☆14 Oct 20, 2022 Updated 3 years ago
- 🔥This is a curated list of "A survey on Efficient Vision-Language Action Models" research. We will continue to maintain and update the r… ☆131 Jan 5, 2026 Updated last month
- ICML2025 ☆63 Aug 28, 2025 Updated 6 months ago
- Simplified Diffusion Schrödinger Bridge ☆13 Apr 19, 2024 Updated last year
- ☆11 Mar 3, 2025 Updated 11 months ago
- Responsible Visual Editing ☆15 Jul 10, 2024 Updated last year
- Code Repository for the NeurIPS 2022 paper: "Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights". ☆17 Jul 10, 2024 Updated last year
- ☆13 Jul 10, 2024 Updated last year
- Evaluation codes of "From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models". ☆16 May 15, 2023 Updated 2 years ago
- 🚀 Extension of a pl0 language compiler for the Compiler Principles course at Hainan University ☆10 Dec 19, 2020 Updated 5 years ago
- Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting ☆14 Dec 19, 2025 Updated 2 months ago