XiaoduoAILab / XmodelVLMView external linksLinks
☆69Jun 20, 2024Updated last year
Alternatives and similar repositories for XmodelVLM
Users that are interested in XmodelVLM are comparing it to the libraries listed below
Sorting:
- [ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts☆17Mar 11, 2025Updated 11 months ago
- ☆16Jul 23, 2024Updated last year
- XmodelLM☆38Nov 19, 2024Updated last year
- ☆124Jul 29, 2024Updated last year
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆24Sep 26, 2024Updated last year
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆42Aug 4, 2024Updated last year
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated last year
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications☆52Oct 30, 2025Updated 3 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆43Jun 28, 2024Updated last year
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Mar 22, 2024Updated last year
- ☆43May 6, 2024Updated last year
- This is an implementation of the paper "Are We Done with Object-Centric Learning?"☆12Sep 11, 2025Updated 5 months ago
- LMM for VQA, tcsvt version☆11Jul 19, 2024Updated last year
- [ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations☆16Oct 18, 2025Updated 3 months ago
- ☆11Apr 21, 2025Updated 9 months ago
- Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le…☆13Jan 16, 2025Updated last year
- ☆12Mar 18, 2024Updated last year
- TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment☆10Mar 1, 2025Updated 11 months ago
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 Pytorch Code)☆17May 15, 2025Updated 8 months ago
- [CVPR 2025] MicroVQA eval and 🤖RefineBot code for "MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research"…☆32Nov 25, 2025Updated 2 months ago
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"☆43Apr 3, 2025Updated 10 months ago
- Segment Anything Is Not Always Perfect: An Investigation of SAM on Different Real-World Application☆33Nov 12, 2024Updated last year
- ☆132Apr 17, 2025Updated 9 months ago
- ☆97Jun 23, 2025Updated 7 months ago
- PKU-I2IQA: An Image-to-Image Quality Assessment Database for AI Generated Images☆16Dec 4, 2024Updated last year
- ☆15Nov 7, 2024Updated last year
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion☆13Mar 17, 2025Updated 10 months ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Dec 19, 2024Updated last year
- The official repo of continuous speculative decoding☆31Mar 28, 2025Updated 10 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025☆32May 1, 2025Updated 9 months ago
- Official PyTorch implementation of "No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding"☆32May 20, 2024Updated last year
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆64Jul 22, 2025Updated 6 months ago
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆16Nov 22, 2024Updated last year
- ☆16Mar 26, 2025Updated 10 months ago
- Official implementation of "OpenCity3D: What do Vision-Language Models know about Urban Environments?" @ WACV2025☆16Nov 24, 2024Updated last year
- ☆21Jul 25, 2025Updated 6 months ago
- OpenMMLab Detection Toolbox and Benchmark for V3Det☆15Apr 3, 2024Updated last year
- ☆41Jun 9, 2025Updated 8 months ago
- This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Gener…☆17Jul 1, 2024Updated last year