Aaronhuang-778 / MC-MoE
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More
☆28 · Updated 3 months ago
Alternatives and similar repositories for MC-MoE:
Users interested in MC-MoE are comparing it to the repositories listed below.
- [NeurIPS'24] An efficient and accurate memory-saving method for W4A4 large multi-modal models. ☆59 · Updated 2 weeks ago
- Official implementation of the paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" proposed by Pekin… ☆68 · Updated 3 months ago
- The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction". ☆50 · Updated 2 weeks ago
- The official PyTorch implementation of the paper "Towards Accurate Post-training Quantization for Diffusion Models" (CVPR24 Poste… ☆35 · Updated 7 months ago
- The official implementation of "MonoFormer: One Transformer for Both Diffusion and Autoregression". ☆80 · Updated 3 months ago
- [ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation ☆32 · Updated 4 months ago
- The official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality" ☆43 · Updated this week
- VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆203 · Updated last week
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" ☆40 · Updated 7 months ago
- DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention ☆117 · Updated last month
- Denoising Diffusion Step-aware Models (ICLR 2024) ☆53 · Updated 11 months ago
- Adaptive Caching for Faster Video Generation with Diffusion Transformers ☆135 · Updated 2 months ago
- The official code implementation of the paper "Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models" ☆22 · Updated 2 weeks ago
- torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for diffusion models. ☆21 · Updated 9 months ago
- Liquid: Language Models are Scalable Multi-modal Generators ☆60 · Updated last month
- 🔥 Aurora Series: A more efficient multimodal large language model series for video. ☆62 · Updated 2 months ago
- VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation ☆84 · Updated 4 months ago
- Accelerating Diffusion Transformers with Token-wise Feature Caching ☆47 · Updated last week
- [NeurIPS 2024] Efficient Multi-modal Models via Stage-wise Visual Context Compression ☆50 · Updated 5 months ago
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. ☆58 · Updated 3 months ago
- [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching ☆91 · Updated 6 months ago
- Official code for the paper "[CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster". ☆44 · Updated last month
- Open implementation of "RandAR" ☆50 · Updated last week
- [CVPR'23] SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer ☆61 · Updated 8 months ago
- [NeurIPS 2024] MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks ☆102 · Updated 2 months ago
- Official implementation of "Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization" ☆76 · Updated 9 months ago
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos ☆36 · Updated last month