VILA-Lab / DELTLinks
(CVPR 2025) Official implementation to DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA top 1-acc by +1.3% and increases diversity per class by +5%
☆25Updated 3 months ago
Alternatives and similar repositories for DELT
Users that are interested in DELT are comparing it to the libraries listed below
Sorting:
- [ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models☆79Updated 6 months ago
- Data distillation benchmark☆71Updated 6 months ago
- Dimple, the first Discrete Diffusion Multimodal Large Language Model☆112Updated 5 months ago
- Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning☆52Updated last month
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning"☆21Updated last year
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture☆211Updated 11 months ago
- [ICML 2025] This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?"☆143Updated last year
- [WACV 2025] Official implementation of "Online-LoRA: Task-free Online Continual Learning via Low Rank Adaptation" by Xiwen Wei, Guihong L…☆52Updated 3 months ago
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling."☆139Updated 4 months ago
- [TMLR'24] This repository includes the official implementation our paper "FedConv: Enhancing Convolutional Neural Networks for Handling D…☆25Updated last year
- ☆27Updated last year
- ☆105Updated 6 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement☆123Updated 4 months ago
- [AAAI2025] ChatterBox: Multi-round Multimodal Referring and Grounding, Multimodal, Multi-round dialogues☆58Updated 7 months ago
- Matryoshka Multimodal Models☆120Updated 10 months ago
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)☆25Updated 9 months ago
- Adapting LLaMA Decoder to Vision Transformer☆30Updated last year
- [TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models☆144Updated 2 months ago
- [NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient☆63Updated 2 months ago
- [CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs☆156Updated last year
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.☆166Updated 2 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025]☆80Updated 8 months ago
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models☆51Updated 7 months ago
- [NeurIPS 2025] Thinkless: LLM Learns When to Think☆245Updated 2 months ago
- An open source implementation of CLIP (With TULIP Support)☆163Updated 6 months ago
- Official Implementation of "Pix2Cap-COCO: Advancing Visual Comprehension via Pixel-Level Captioning"☆24Updated last month
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)☆86Updated 2 months ago
- official repo for paper "[CLS] Token Tells Everything Needed for Training-free Efficient MLLMs"☆23Updated 7 months ago
- PyTorch Implementation of Zero-Shot Vision Encoder Grafting via LLM Surrogates [ICCV'25]☆51Updated 5 months ago
- [NIPS2023]Implementation of Foundation Model is Efficient Multimodal Multitask Model Selector☆37Updated last year