Adlith / MoE-Jetpack
[NeurIPS 24] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
☆101 Updated last month
Alternatives and similar repositories for MoE-Jetpack:
Users who are interested in MoE-Jetpack are comparing it to the repositories listed below:
- ☆61 Updated 2 months ago
- DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution ☆39 Updated 2 months ago
- [NeurIPS 2024 Spotlight ⭐️] Parameter-Inverted Image Pyramid Networks (PIIP) ☆77 Updated this week
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference ☆74 Updated 4 months ago
- Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster. ☆44 Updated last month
- ☆108 Updated 5 months ago
- This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model ☆91 Updated 6 months ago
- Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆60 Updated 5 months ago
- [ECCV 2024] ControlCap: Controllable Region-level Captioning ☆64 Updated 2 months ago
- Learning 1D Causal Visual Representation with De-focus Attention Networks ☆32 Updated 7 months ago
- Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation" ☆34 Updated 3 weeks ago
- Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" proposed by Pekin… ☆67 Updated 2 months ago
- Text4Seg: Reimagining Image Segmentation as Text Generation ☆40 Updated 2 weeks ago
- Liquid: Language Models are Scalable Multi-modal Generators ☆60 Updated last month
- OVMR: Open-Vocabulary Recognition with Multi-Modal References (CVPR24) ☆23 Updated last month
- This is the official repo for ByteVideoLLM/Dynamic-VLM ☆18 Updated last month
- ☆58 Updated last year
- ☆32 Updated 3 weeks ago
- The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction". ☆51 Updated last week
- [CVPR 2023] RILS: Masked Visual Reconstruction in Language Semantic Space (https://arxiv.org/abs/2301.06958) ☆44 Updated last year
- The official implementation of "Adapter is All You Need for Tuning Visual Tasks". ☆76 Updated 4 months ago
- ☆29 Updated 9 months ago
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆32 Updated 7 months ago
- MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer ☆36 Updated 4 months ago
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding ☆39 Updated this week
- [TPAMI reviewing] Towards Visual Grounding: A Survey ☆42 Updated last week
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at NeurIPS 2023 (Main confer… ☆21 Updated 9 months ago
- [AAAI 2025] Linear-complexity Visual Sequence Learning with Gated Linear Attention ☆103 Updated 7 months ago
- [NeurIPS'24] A Simple Image Segmentation Framework via In-Context Examples ☆46 Updated 2 months ago
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration" ☆67 Updated 3 months ago