[CVPR 2025 & IJCV 2026] Official PyTorch Code for "MMRL: Multi-Modal Representation Learning for Vision-Language Models" and its extension "MMRL++: Parameter-Efficient and Interaction-Aware Representation Learning for Vision-Language Models".
☆97 · Feb 5, 2026 · Updated last month
Alternatives and similar repositories for MMRL
Users who are interested in MMRL are comparing it to the repositories listed below:
- Official repo for ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models ☆25 · Mar 24, 2025 · Updated 11 months ago
- Parameter-Efficient Fine-Tuning for Foundation Models ☆111 · Mar 31, 2025 · Updated 11 months ago
- Generative Modeling with Bayesian Sample Inference ☆24 · May 17, 2025 · Updated 9 months ago
- Official implementation of Nemesis: Normalizing the Soft-prompt Vectors of Vision-Language Models (ICLR 2024 Spotlight) ☆15 · Dec 27, 2024 · Updated last year
- The official PyTorch implementation of our IJCV-2025 paper "Learning with Enriched Inductive Biases for Vision-Language Models". ☆14 · Mar 26, 2025 · Updated 11 months ago
- ☆22 · Jan 12, 2026 · Updated last month
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆15 · Feb 9, 2026 · Updated 3 weeks ago
- Adaptation of vision-language models (CLIP) to downstream tasks using local and global prompts. ☆51 · Jul 10, 2025 · Updated 7 months ago
- The official PyTorch implementation of our CVPR-2024 paper "MMA: Multi-Modal Adapter for Vision-Language Models". ☆95 · Apr 24, 2025 · Updated 10 months ago
- ☆16 · Dec 16, 2024 · Updated last year
- ☆21 · Dec 2, 2025 · Updated 3 months ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆31 · Feb 26, 2026 · Updated last week
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C… ☆17 · May 27, 2024 · Updated last year
- [ICCV 2025] Official implementation of "What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?" ☆18 · Aug 7, 2025 · Updated 6 months ago
- [ECCV 2024] FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance ☆17 · Sep 11, 2024 · Updated last year
- Code implementation of the paper 'FIction: 4D Future Interaction Prediction from Video' ☆18 · Mar 19, 2025 · Updated 11 months ago
- Official code for ICCV 2023 paper, "Improving Zero-Shot Generalization for CLIP with Synthesized Prompts" ☆103 · Mar 6, 2024 · Updated last year
- Out-of-Distribution Semantic Occupancy Prediction ☆20 · Oct 22, 2025 · Updated 4 months ago
- ☆27 · Dec 8, 2025 · Updated 2 months ago
- ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation ☆19 · Feb 14, 2025 · Updated last year
- UQ: Assessing Language Models on Unsolved Questions ☆30 · Aug 26, 2025 · Updated 6 months ago
- ☆50 · Mar 14, 2025 · Updated 11 months ago
- The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models". ☆18 · Apr 25, 2025 · Updated 10 months ago
- [ICLR'25] Code for KaSA, an official implementation of "KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models" ☆20 · Jan 16, 2025 · Updated last year
- [ICLR 2026] Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing ☆29 · Feb 6, 2026 · Updated 3 weeks ago
- [NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO ☆79 · Oct 29, 2025 · Updated 4 months ago
- [ICCV 2025] Official implementation of "AD-GS: Object-Aware B-Spline Gaussian Splatting for Self-Supervised Autonomous Driving" ☆35 · Jul 15, 2025 · Updated 7 months ago
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning ☆35 · Aug 28, 2025 · Updated 6 months ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR 2025] ☆21 · Feb 27, 2025 · Updated last year
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning ☆57 · Oct 10, 2025 · Updated 4 months ago
- ☆33 · Jul 15, 2025 · Updated 7 months ago
- Implementation for AutoIOT: LLM-Driven Automated Natural Language Programming for AIoT Applications ☆34 · Apr 21, 2025 · Updated 10 months ago
- KeypointDETR: An End-to-End 3D Keypoint Detector [ECCV 2024 Oral] ☆26 · Oct 6, 2024 · Updated last year
- Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving ☆32 · Nov 20, 2025 · Updated 3 months ago
- [CVPR 2025] FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression ☆61 · Oct 10, 2025 · Updated 4 months ago
- A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP. ☆755 · Dec 1, 2025 · Updated 3 months ago
- ☆30 · Jan 18, 2026 · Updated last month
- [NeurIPS'25] SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning ☆41 · Oct 14, 2025 · Updated 4 months ago
- Finetuning and inference tools for the CogView4 and CogVideoX model series. ☆118 · May 14, 2025 · Updated 9 months ago