Nusrat-Prottasha / PEFT-A2ZLinks

☆30

Alternatives and similar repositories for PEFT-A2Z

Users that are interested in PEFT-A2Z are comparing it to the libraries listed below

Sorting:

waltonfuture / RL-with-Cold-Start
SFT+RL boosts multimodal reasoning
☆32Updated 3 months ago
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆79Updated last month
Liuziyu77 / MIA-DPO
Official implement of MIA-DPO
☆66Updated 8 months ago
yuecao0119 / MMFuser
The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …
☆58Updated 10 months ago
OpenSparseLLMs / CLIP-MoE
CLIP-MoE: Mixture of Experts for CLIP
☆46Updated 11 months ago
Dongping-Chen / ISG
(ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph.
☆28Updated last month
sterzhang / PVIT
Official Repository of Personalized Visual Instruct Tuning
☆32Updated 6 months ago
yu-rp / Dimple
Dimple, the first Discrete Diffusion Multimodal Large Language Model
☆98Updated 2 months ago
hulianyuyy / iLLaVA
iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models
☆18Updated 7 months ago
THU-MIG / VTC-CLS
official repo for paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"
☆23Updated 5 months ago
iancovert / locality-alignment
☆52Updated 8 months ago
OpenGVLab / PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆47Updated 3 months ago
waltonfuture / MM-UPT
[NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO
☆48Updated last week
MME-Benchmarks / MME-Unify
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
☆41Updated 5 months ago
findalexli / mllm-dpo
[ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model
☆47Updated 10 months ago
OpenGVLab / MMIU
[ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
☆86Updated last year
UMass-Embodied-AGI / FlexAttention
[ECCV 2024] FlexAttention for Efficient High-Resolution Vision-Language Models
☆43Updated 8 months ago
THUDM / Awesome-Parameter-Efficient-Fine-Tuning-for-Foundation-Models
Parameter-Efficient Fine-Tuning for Foundation Models
☆93Updated 5 months ago
chancharikmitra / SAVs
Official Codebase for "Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers"
☆16Updated 3 months ago
Haochen-Wang409 / TreeVGR
Official implementation of "Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology"
☆64Updated 2 months ago
GAIR-NLP / thinking-with-generated-images
Doodling our way to AGI ✏️ 🖼️ 🧠
☆103Updated 3 months ago
TIGER-AI-Lab / QuickVideo
Quick Long Video Understanding
☆64Updated 3 months ago
LanceZPF / OpenING
Official Implementation of OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
☆32Updated 2 months ago
savadikarc / wegeft
WeGeFT: Weight‑Generative Fine‑Tuning for Multi‑Faceted Efficient Adaptation of Large Models
☆21Updated 2 months ago
markywg / transagent
[NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration
☆24Updated 11 months ago
Mr-Loevan / FAST
Fast-Slow Thinking for Large Vision-Language Model Reasoning
☆18Updated 4 months ago
czg1225 / VeriThinker
[NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient
☆53Updated last week
GeWu-Lab / MokA
MokA: Multimodal Low-Rank Adaptation for MLLMs
☆29Updated 3 months ago
yuecao0119 / MMInstruct
[SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…
☆57Updated 10 months ago
deepglint / UniME
[ACM MM25] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"
☆90Updated last month