MeganTj / multimodal_alignment
☆16 Updated 4 months ago
Alternatives and similar repositories for multimodal_alignment
Users interested in multimodal_alignment are comparing it to the repositories listed below.
- ☆21 Updated 11 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆49 Updated 5 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆16 Updated 7 months ago
- [TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics" ☆20 Updated 2 years ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆31 Updated 5 months ago
- ☆23 Updated 4 months ago
- Holistic evaluation of multimodal foundation models ☆48 Updated last year
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆59 Updated 10 months ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆47 Updated last month
- Official implementation of RMoE (Layerwise Recurrent Router for Mixture-of-Experts) ☆24 Updated last year
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? ☆15 Updated 4 months ago
- Project for SNARE benchmark ☆11 Updated last year
- CS194-196 Course Project ☆15 Updated 8 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆55 Updated last year
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?" ☆30 Updated 10 months ago
- [NeurIPS 2025] Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models ☆39 Updated 6 months ago
- Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ☆16 Updated 11 months ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding" ☆20 Updated 10 months ago
- [CVPR 2025] MicroVQA eval and 🤖RefineBot code for "MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research"… ☆25 Updated last week
- Distributed Optimization Infra for learning CLIP models ☆27 Updated last year
- ☆55 Updated last year
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202… ☆37 Updated 5 months ago
- ☆44 Updated 5 months ago
- [TMLR 2022] High-Modality Multimodal Transformer ☆117 Updated 11 months ago
- [ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models ☆78 Updated 4 months ago
- ☆19 Updated 9 months ago
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning ☆46 Updated last year
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆31 Updated 11 months ago
- ☆44 Updated 11 months ago
- Resa: Transparent Reasoning Models via SAEs ☆44 Updated last month