Aaronhuang-778 / MC-MoE
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More
☆20 · Updated last month
Related projects
Alternatives and complementary repositories for MC-MoE
- Denoising Diffusion Step-aware Models (ICLR2024) ☆52 · Updated 9 months ago
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. ☆57 · Updated last month
- [NeurIPS'24] Efficient and accurate memory saving method towards W4A4 large multi-modal models. ☆41 · Updated last month
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆26 · Updated 5 months ago
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆76 · Updated last month
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆61 · Updated last week
- Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes" ☆51 · Updated 7 months ago
- DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention ☆113 · Updated 5 months ago
- Official implementation of "Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization" ☆75 · Updated 7 months ago
- [3DV 2025] Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model ☆47 · Updated 5 months ago
- [CVPR 2023] Vote2Cap-DETR and [T-PAMI 2024] Vote2Cap-DETR++; A set-to-set perspective towards 3D Dense Captioning; State-of-the-Art 3D De… ☆86 · Updated 3 months ago
- VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆144 · Updated last month
- This is a repo to track the latest autoregressive visual generation papers. ☆50 · Updated this week
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆78 · Updated 10 months ago
- [ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation ☆31 · Updated 2 months ago
- Empowering Unified MLLM with Multi-granular Visual Generation ☆106 · Updated last month
- [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities ☆53 · Updated last month
- ☆36 · Updated last year
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" ☆34 · Updated 5 months ago
- 😎 Awesome lists of papers and codes about open-vocabulary perception, including both 3D and 2D ☆24 · Updated 4 months ago
- Official code of DMA: Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding, ECCV 2024 ☆24 · Updated 4 months ago
- Can 3D Vision-Language Models Truly Understand Natural Language? ☆21 · Updated 7 months ago
- [NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding ☆48 · Updated 3 weeks ago
- (ICCV2023) IST-Net: Prior-free Category-level Pose Estimation with Implicit Space Transformation ☆106 · Updated 11 months ago
- The official source code for "X-Ray: A Sequential 3D Representation for Generation". ☆96 · Updated 5 months ago
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics ☆55 · Updated last month
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆26 · Updated 3 weeks ago
- This is the official implementation for ControlVAR. ☆57 · Updated last month
- 🔥 ImageFolder: Autoregressive Image Generation with Folded Tokens ☆57 · Updated last week
- A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World ☆172 · Updated last month