OscarXZQ / weight-selection
☆172 Updated 4 months ago
Alternatives and similar repositories for weight-selection:
Users who are interested in weight-selection are comparing it to the repositories listed below:
- Object Recognition as Next Token Prediction (CVPR 2024 Highlight) ☆170 Updated last month
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆257 Updated 9 months ago
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆100 Updated 4 months ago
- Matryoshka Multimodal Models ☆93 Updated last week
- [ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers. ☆101 Updated last month
- Official code for our CVPR'22 paper “Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space” ☆248 Updated last year
- ☆49 Updated last year
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation" ☆123 Updated 9 months ago
- Official code for "TOAST: Transfer Learning via Attention Steering" ☆186 Updated last year
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture ☆188 Updated 3 weeks ago
- 1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio… ☆217 Updated 5 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆138 Updated 10 months ago
- A repository for DenseSSMs ☆86 Updated 9 months ago
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ☆113 Updated 8 months ago
- A framework for merging models solving different tasks with different initializations into one multi-task model without any additional tr… ☆290 Updated last year
- This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?" ☆128 Updated 7 months ago
- Official implementation of the Law of Vision Representation in MLLMs ☆148 Updated 2 months ago
- Code for NOLA, an implementation of "nola: Compressing LoRA using Linear Combination of Random Basis" ☆50 Updated 5 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆79 Updated this week
- The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A su… ☆220 Updated 2 weeks ago
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings" ☆118 Updated 5 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆96 Updated 4 months ago
- ✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models ☆147 Updated last month
- [ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea… ☆97 Updated 8 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆72 Updated 4 months ago
- [NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context ☆144 Updated 4 months ago
- ☆198 Updated last year
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling" ☆128 Updated 2 months ago
- [ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer" ☆270 Updated last month
- When do we not need larger vision models? ☆357 Updated last month