OscarXZQ / weight-selection
☆180Updated 8 months ago
Alternatives and similar repositories for weight-selection
Users that are interested in weight-selection are comparing it to the libraries listed below
- Official code for our CVPR'22 paper “Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space”☆249Updated last year
- Official code for "TOAST: Transfer Learning via Attention Steering"☆188Updated last year
- Matryoshka Multimodal Models☆107Updated 4 months ago
- Object Recognition as Next Token Prediction (CVPR 2024 Highlight)☆178Updated last month
- [ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.☆102Updated 5 months ago
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"☆101Updated 8 months ago
- ☆50Updated last year
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch☆294Updated 2 months ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis"☆55Updated 9 months ago
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"☆123Updated last year
- [CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for C…☆248Updated 4 months ago
- [ICCV2023] Dataset Quantization☆258Updated last year
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"☆97Updated 8 months ago
- A repository for DenseSSMs☆87Updated last year
- Model Merging with SVD to Tie the KnOTS [ICLR 2025]☆56Updated 2 months ago
- Dataset pruning for ImageNet and LAION-2B.☆79Updated 10 months ago
- [ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear at…☆101Updated 11 months ago
- Code accompanying the paper "Massive Activations in Large Language Models"☆162Updated last year
- Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL).☆129Updated last month
- ☆103Updated last year
- Official implementation for the paper "Prompt Pre-Training with Over Twenty-Thousand Classes for Open-Vocabulary Visual Recognition"☆257Updated last year
- Official implementation of the Law of Vision Representation in MLLMs☆155Updated 6 months ago
- 1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio…☆220Updated 9 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆92Updated last week
- [TMLR] Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"☆139Updated 6 months ago
- [ICML 2025] This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?"☆132Updated 11 months ago
- A simple minimal implementation of Reversible Vision Transformers☆125Updated last year
- [NeurIPS'22] This is an official implementation for "Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning".☆182Updated last year
- ☆115Updated 10 months ago
- PB-LLM: Partially Binarized Large Language Models☆152Updated last year