naver-ai / model-stock
Model Stock: All we need is just a few fine-tuned models
☆99Updated 3 months ago
Alternatives and similar repositories for model-stock:
Users that are interested in model-stock are comparing it to the libraries listed below
- ☆159Updated 11 months ago
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specific…☆62Updated 4 months ago
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla…☆49Updated 3 months ago
- Patching open-vocabulary models by interpolating weights☆91Updated last year
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024).☆38Updated 5 months ago
- Code for T-MARS data filtering☆35Updated last year
- Code accompanying the paper "Massive Activations in Large Language Models"☆133Updated 10 months ago
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam☆73Updated 5 months ago
- Language Quantized AutoEncoders☆95Updated last year
- [ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT☆52Updated 5 months ago
- Implementation of Infini-Transformer in PyTorch☆107Updated 2 weeks ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent☆70Updated 5 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".☆92Updated last year
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision"☆37Updated last year
- Code for NOLA, an implementation of "nola: Compressing LoRA using Linear Combination of Random Basis"☆50Updated 4 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆75Updated 3 months ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization"☆47Updated 7 months ago
- Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"☆123Updated 8 months ago
- Monet: Mixture of Monosemantic Experts for Transformers☆43Updated this week
- Official repo of Progressive Data Expansion: data, code and evaluation☆27Updated last year
- ☆124Updated 11 months ago
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)☆86Updated last year
- Matryoshka Multimodal Models☆90Updated last month
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆47Updated last month
- Code and data for the benchmark "Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Lan…☆36Updated 6 months ago
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024.☆61Updated 2 months ago
- Language models scale reliably with over-training and on downstream tasks☆96Updated 9 months ago
- This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024.☆92Updated 6 months ago
- Online Adaptation of Language Models with a Memory of Amortized Contexts (NeurIPS 2024)☆59Updated 5 months ago
- Recycling diverse models☆44Updated 2 years ago