FeipengMa6/VLoRA

[NeurIPS 2024] Visual Perception by Large Language Model’s Weights

☆56

Alternatives and similar repositories for VLoRA

Users who are interested in VLoRA are comparing it to the repositories listed below.

Accio-Lab / SwimBird
☆18 · Apr 9, 2026 · Updated 3 months ago
yongliang-wu / ExploreCfg
[NeurIPS2023] Exploring Diverse In-Context Configurations for Image Captioning
☆47 · Nov 26, 2024 · Updated last year
Accio-Lab / Metis
Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models
☆35 · Apr 10, 2026 · Updated 3 months ago
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆25 · May 14, 2026 · Updated 2 months ago
wenyu1009 / RTSRN
☆20 · Sep 19, 2023 · Updated 2 years ago
JaaackHongggg / WorldSense
WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs
☆50 · Jul 12, 2026 · Updated 2 weeks ago
sd0809 / EoFormer
EoFormer: Edge-oriented Transformer for Brain Tumor Segmentation
☆26 · Jul 7, 2024 · Updated 2 years ago
YuxiXie / V-DPO
Preference Learning for LLaVA
☆60 · Nov 9, 2024 · Updated last year
McGill-NLP / AURORA
Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
☆35 · Jun 30, 2025 · Updated last year
WillDreamer / Awesome-MLLM-Reasoning
Recent Advances on MLLM's Reasoning Ability
☆26 · Apr 11, 2025 · Updated last year
GasolSun36 / MVP
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
☆24 · Sep 9, 2024 · Updated last year
PKU-YuanGroup / LLMBind
LLMBind: A Unified Modality-Task Integration Framework
☆19 · Jun 16, 2024 · Updated 2 years ago
mbzuai-oryx / TrackingMeetsLMM
☆10 · Apr 7, 2025 · Updated last year
jylins / hourllava
[NeurIPS 2025 Spotlight] Unleashing Hour-Scale Video Training for Long Video-Language Understanding
☆19 · Jun 24, 2025 · Updated last year
DataArcTech / ChartBench
☆16 · May 15, 2025 · Updated last year
booker-max / Unsupervised-Deraining-with-Event-Camera
☆25 · Oct 7, 2024 · Updated last year
mair-lab / EARL
EARL: Editing with Autoregression and RL
☆43 · Nov 21, 2025 · Updated 8 months ago
Espere-1119-Song / Video-MMLU
A Massive Multi-Discipline Lecture Understanding Benchmark
☆34 · Apr 20, 2026 · Updated 3 months ago
HebeiFast / EventLowLightVOS
☆13 · Jun 5, 2024 · Updated 2 years ago
OpenGVLab / MUTR
「AAAI 2024」 Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation
☆85 · Jun 13, 2025 · Updated last year
shilinyan99 / CrossLMM
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
☆25 · Dec 21, 2025 · Updated 7 months ago
thunlp / Migician
[ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
☆90 · May 20, 2025 · Updated last year
Fsoft-AIC / Z-GMOT
[NAACL 2024] Z-GMOT: Zero-shot Generic Multiple Object Tracking
☆12 · May 19, 2026 · Updated 2 months ago
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆94 · Aug 8, 2025 · Updated 11 months ago
minhoooo1 / CatMAE
CatMAE
☆15 · Dec 13, 2023 · Updated 2 years ago
lucasjinreal / wnnx_models
Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`
☆12 · Jun 22, 2022 · Updated 4 years ago
yliu-cs / PiTe
[ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model
☆17 · Feb 13, 2025 · Updated last year
hrtang22 / MUSE
Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)"
☆26 · Feb 2, 2025 · Updated last year
zlab-princeton / UEval
UEval: A Benchmark for Unified Multimodal Generation
☆24 · Apr 20, 2026 · Updated 3 months ago
bronyayang / Law_of_Vision_Representation_in_MLLMs
[COLM'25] Official implementation of the Law of Vision Representation in MLLMs
☆177 · Oct 6, 2025 · Updated 9 months ago
LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16 · Jul 19, 2026 · Updated last week
zeyofu / Commonsense-T2I
Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]
☆24 · Aug 13, 2024 · Updated last year
microsoft / act
AML Command Transfer. A lightweight tool to transfer any command line to Azure Machine Learning Services
☆20 · May 23, 2024 · Updated 2 years ago
TIGER-AI-Lab / VISTA
The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]
☆20 · Feb 27, 2025 · Updated last year
MengLcool / SliMM
☆25 · Dec 26, 2024 · Updated last year
DocTron-hub / Chart-R1
Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner
☆24 · Aug 7, 2025 · Updated 11 months ago
mlvlab / ST-VLM
☆13 · Mar 28, 2025 · Updated last year
witnessai / Awesome-Zero-Shot-Object-Detection
A curated list of papers, datasets and resources pertaining to zero-shot object detection.
☆29 · Mar 15, 2023 · Updated 3 years ago
yongliang-wu / NumPro
[CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga
☆150 · Jan 19, 2026 · Updated 6 months ago