gstoica27 / ZipIt
A framework for merging models trained on different tasks from different initializations into one multi-task model, without any additional training
☆290 · Updated last year
Alternatives and similar repositories for ZipIt:
Users who are interested in ZipIt are comparing it to the libraries listed below:
- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time ☆438 · Updated 6 months ago
- Official code for our CVPR'22 paper “Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space” ☆248 · Updated last year
- ☆198 · Updated last year
- Official code for "TOAST: Transfer Learning via Attention Steering" ☆186 · Updated last year
- Learning from synthetic data - code and models ☆307 · Updated last year
- Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch ☆257 · Updated 9 months ago
- ☆172 · Updated 4 months ago
- Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch ☆299 · Updated 7 months ago
- Object Recognition as Next Token Prediction (CVPR 2024 Highlight) ☆170 · Updated last month
- ☆160 · Updated 11 months ago
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training" ☆309 · Updated 7 months ago
- PyTorch code for hierarchical k-means -- a data curation method for self-supervised learning ☆141 · Updated 7 months ago
- [ICCV2023] Dataset Quantization ☆256 · Updated last year
- Collection of Tools and Papers related to Adapters / Parameter-Efficient Transfer Learning / Fine-Tuning ☆179 · Updated 8 months ago
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation" ☆123 · Updated 9 months ago
- When do we not need larger vision models? ☆357 · Updated last month
- ☆109 · Updated last year
- Code accompanying the paper "Massive Activations in Large Language Models" ☆138 · Updated 10 months ago
- Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral) ☆112 · Updated 9 months ago
- Code release for "Dropout Reduces Underfitting" ☆311 · Updated last year
- Editing Models with Task Arithmetic ☆445 · Updated last year
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆100 · Updated 4 months ago
- DataComp: In search of the next generation of multimodal datasets ☆674 · Updated last year
- ☆181 · Updated last year
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666. ☆294 · Updated last week
- A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE). ☆137 · Updated 3 weeks ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆50 · Updated 5 months ago
- Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143) ☆158 · Updated last year
- 1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio… ☆217 · Updated 5 months ago
- [ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer" ☆270 · Updated last month