Adlith / MoE-Jetpack
[NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
☆19 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for MoE-Jetpack
- Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" proposed by Pekin… ☆52 Updated 3 weeks ago
- MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer ☆33 Updated 2 months ago
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models" ☆89 Updated last month
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at NeurIPS 2023 (Main confer… ☆22 Updated 7 months ago
- [CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models ☆89 Updated 2 months ago
- Automatically update arXiv papers about SOT & VLT, Multi-modal Learning, LLM and Video Understanding using GitHub Actions. ☆17 Updated this week
- DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution ☆39 Updated this week
- A paper list of recent work on token compression for ViT and VLM ☆133 Updated this week
- ☆22 Updated 4 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆64 Updated 3 weeks ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆78 Updated 7 months ago
- ☆78 Updated 9 months ago
- ☆21 Updated 3 months ago
- The official implementation of "Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation" (NeurIPS 2024) ☆36 Updated 3 weeks ago
- The official implementation of RAR ☆72 Updated 7 months ago
- The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction". ☆42 Updated last week
- This is a PyTorch implementation of MCLN proposed by our paper "Multi-branch Collaborative Learning Network for 3D Visual Grounding" (ECCV… ☆11 Updated last month
- State-of-the-art open-vocabulary detector on COCO/LVIS/V3Det ☆25 Updated 6 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆106 Updated 2 weeks ago
- [ICLR 2024] ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation ☆52 Updated 6 months ago
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding ☆28 Updated 3 weeks ago
- [NeurIPS 2023] Rank-DETR for High Quality Object Detection ☆87 Updated last year
- CVPR2024: Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models ☆60 Updated 4 months ago
- Official implementation of "Why are Visually-Grounded Language Models Bad at Image Classification?" (NeurIPS 2024) ☆49 Updated 3 weeks ago
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration" ☆59 Updated last month
- OVMR: Open-Vocabulary Recognition with Multi-Modal References (CVPR24) ☆21 Updated last month
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23] ☆94 Updated last year
- FreeVA: Offline MLLM as Training-Free Video Assistant ☆48 Updated 5 months ago
- Text4Seg: Reimagining Image Segmentation as Text Generation ☆22 Updated 3 weeks ago
- Task Residual for Tuning Vision-Language Models (CVPR 2023) ☆65 Updated last year