roymiles / VeLoRA

[NeurIPS 2024] VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections

☆21

Alternatives and similar repositories for VeLoRA

Users interested in VeLoRA are comparing it to the repositories listed below.

gstoica27 / KnOTS
Model Merging with SVD to Tie the KnOTS [ICLR 2025]
☆62 Updated 4 months ago
UCDvision / NOLA
Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis"
☆55 Updated 11 months ago
wjxts / RegularizedBN
☆21 Updated 2 years ago
alexandertheus / Intra-Fusion
Towards Meta-Pruning via Optimal Transport, ICLR 2024 (Spotlight)
☆16 Updated 8 months ago
rgeirhos / dataset-pruning-metrics
Metrics for "Beyond neural scaling laws: beating power law scaling via data pruning" (NeurIPS 2022 Outstanding Paper Award)
☆56 Updated 2 years ago
facebookresearch / ViP-MAE
This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision"
☆36 Updated 2 years ago
james-oldfield / muMoE
[NeurIPS'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
☆33 Updated 10 months ago
tanganke / opcm
☆14 Updated 6 months ago
zju-vipa / training_free_model_merging
This repository is the implementation of the paper "Training Free Pretrained Model Merging" (CVPR 2024).
☆31 Updated last year
locuslab / T-MARS
Code for T-MARS data filtering
☆35 Updated last year
andyjm3 / SLTrain
SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024)
☆32 Updated 9 months ago
PaulAlbert31 / RandLoRA
☆23 Updated 2 months ago
VijayLingam95 / SVFT
☆30 Updated 6 months ago
wy1iu / butterfly-oft
Official implementation of "Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization"
☆79 Updated last year
ExplainableML / fomo_in_flux
Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]
☆57 Updated 8 months ago
deep-spin / adasplash
AdaSplash: Adaptive Sparse Flash Attention (aka Flash Entmax Attention)
☆19 Updated 3 weeks ago
ggjy / vision_weak_to_strong
☆38 Updated last year
jongwooko / distillm-2
Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral)
☆34 Updated last month
eric-ai-lab / PEViT
Official implementation of AAAI 2023 paper "Parameter-efficient Model Adaptation for Vision Transformers"
☆105 Updated 2 years ago
zyxxmu / DSnoT
Official PyTorch Implementation of Our Paper Accepted at ICLR 2024 -- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM…
☆49 Updated last year
NUS-HPC-AI-Lab / DD-Ranking
Data distillation benchmark
☆67 Updated last month
TomerRonen34 / mixed-resolution-vit
☆51 Updated last year
mwbini / ether
[ICML24] Official Implementation of "ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections"
☆16 Updated last year
philippe-eecs / vitok
☆34 Updated 2 months ago
gortizji / tangent_task_arithmetic
Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".
☆103 Updated 2 years ago
mrflogs / LoRA-Pro
Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized?"
☆127 Updated 4 months ago
Wang-ML-Lab / multimodal-needle-in-a-haystack
[NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models
☆48 Updated 3 months ago
roymiles / ITRD
[BMVC 2022] Information Theoretic Representation Distillation
☆18 Updated last year
locuslab / llava-token-compression
☆43 Updated 9 months ago
pixeli99 / MixLN
[ICLR 2025] Official PyTorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…
☆25 Updated 2 weeks ago