oneHuster / MixupE

Code for "MixupE: Understanding and Improving Mixup from Directional Derivative Perspective" (UAI 2023, Oral)

☆27

Alternatives and similar repositories for MixupE

Users who are interested in MixupE are comparing it to the repositories listed below

ZhaoxuanWu / DAVINZ-DataValuation
Training-free data valuation on deep neural network applications. (ICML-2022)
☆24 Updated 2 years ago
gortizji / tangent_task_arithmetic
Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".
☆102 Updated last year
haonan3 / AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
☆208 Updated 4 months ago
harveyhuang18 / EMR_Merging
[NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging
☆59 Updated 3 months ago
EnnengYang / AdaMerging
AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024.
☆82 Updated 7 months ago
Wang-ML-Lab / bayesian-peft
[NeurIPS 2024] BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
☆27 Updated 4 months ago
keven980716 / weak-to-strong-deception
[ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"
☆13 Updated 11 months ago
peterljq / Parsimonious-Concept-Engineering
PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024)
☆35 Updated 7 months ago
JingXuTHU / Random-Masking-Finds-Winning-Tickets-for-Parameter-Efficient-Fine-tuning
☆13 Updated last year
pipilurj / ROBOT
☆27 Updated 2 years ago
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆85 Updated 7 months ago
2003pro / ScaleBiO
This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting
☆19 Updated 10 months ago
uiuctml / Localize-and-Stitch
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
☆25 Updated 4 months ago
sail-sg / lm-random-memory-access
☆14 Updated last year
TsinghuaC3I / SoRA
[EMNLP 2023, Main Conference] Sparse Low-rank Adaptation of Pre-trained Language Models
☆76 Updated last year
adymaharana / d2pruning
☆35 Updated last year
thunlp / Modularity-Analysis
Repo for ACL2023 Findings paper "Emergent Modularity in Pre-trained Transformers"
☆23 Updated last year
WeiHuang05 / Awesome_Large_Foundation_Model_Theory
Welcome to the 'In Context Learning Theory' Reading Group
☆28 Updated 6 months ago
EnnengYang / RepresentationSurgery
Representation Surgery for Multi-Task Model Merging. ICML, 2024.
☆45 Updated 7 months ago
nik-dim / tall_masks
Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024]
☆44 Updated 7 months ago
decoding-comp-trust / comp-trust
Codebase for decoding compressed trust.
☆23 Updated last year
AntoAndGar / task_singular_vectors
Task Singular Vectors: Reducing Task Interference in Model Merging. Merge models avoiding task interference through separable models.
☆14 Updated last week
Model-GLUE / Model-GLUE
☆15 Updated 9 months ago
mmatena / model_merging
☆67 Updated 3 years ago
JJchy / CG_score
Data Valuation without Training of a Model, submitted to ICLR'23
☆23 Updated 2 years ago
sail-sg / ActivePRM
☆15 Updated last month
VITA-Group / Junk_DNA_Hypothesis
[ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…
☆16 Updated last month
pilancilab / Riemannian_Preconditioned_LoRA
source code for paper "Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models"
☆26 Updated 11 months ago
tanganke / weight-ensembling_MoE
Code for paper "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts"
☆24 Updated 11 months ago
abhishekpanigrahi1996 / Skill-Localization-by-grafting
☆49 Updated last year