wangf3014 / Patch_Scaling
Official implementation of Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
☆17 · Updated last month
Alternatives and similar repositories for Patch_Scaling:
Users who are interested in Patch_Scaling are comparing it to the repositories listed below:
- Adapting LLaMA Decoder to Vision Transformer ☆28 · Updated 10 months ago
- ☆24 · Updated last month
- Official implementation for FlexAttention for Efficient High-Resolution Vision-Language Models ☆38 · Updated 2 months ago
- CLIP-MoE: Mixture of Experts for CLIP ☆29 · Updated 5 months ago
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity" ☆26 · Updated 5 months ago
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models