sarthmit / Mod_Arch

☆32

Alternatives and similar repositories for Mod_Arch

Users who are interested in Mod_Arch are comparing it to the repositories listed below.

google-research / head2toe
☆81 Updated last year
dguo98 / DiffPruning
Parameter Efficient Transfer Learning with Diff Pruning
☆74 Updated 4 years ago
xtinkt / editable
Supplementary code for Editable Neural Networks, an ICLR 2020 submission.
☆46 Updated 5 years ago
MadryLab / EditingClassifiers
☆96 Updated 3 years ago
google-deepmind / emergent_in_context_learning
☆85 Updated last year
joshr17 / IFM
Code for the paper "Can contrastive learning avoid shortcut solutions?" (NeurIPS 2021).
☆47 Updated 3 years ago
mansheej / data_diet
☆109 Updated 2 years ago
RobertCsordas / transformer_generalization
The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s…
☆67 Updated 2 years ago
MadryLab / datamodels-data
Data for "Datamodels: Predicting Predictions with Training Data"
☆97 Updated 2 years ago
Weixin-Liang / MetaShift
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)
☆109 Updated 3 years ago
RAIVNLab / supsup
Code for "Supermasks in Superposition"
☆124 Updated 2 years ago
sarthmit / Compositional-Attention
Code to reproduce the results for Compositional Attention
☆59 Updated 3 years ago
varunnair18 / FISH
Code for "Training Neural Networks with Fixed Sparse Masks" (NeurIPS 2021).
☆59 Updated 3 years ago
VITA-Group / CV_LTH_Pre-training
[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jon…
☆68 Updated 2 years ago
RobertCsordas / modules
The official repository for our paper "Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks". We…
☆46 Updated 2 years ago
RedRyan111 / GLOM
A PyTorch implementation of Geoffrey Hinton's 2021 paper "How to represent part-whole hierarchies in a neural network".
☆57 Updated 4 years ago
LAION-AI / Conditional-Pretraining-of-Large-Language-Models
☆37 Updated 2 years ago
jiamings / ml-cpc
☆36 Updated 5 years ago
mlfoundations / dataset2metadata
☆27 Updated last year
lucidrains / learning-to-expire-pytorch
An implementation of Transformer with Expire-Span, a circuit for learning which memories to retain
☆34 Updated 5 years ago
lucidrains / memformer
Implementation of Memformer, a Memory-augmented Transformer, in Pytorch
☆124 Updated 5 years ago
mpezeshki / Gradient_Starvation
Gradient Starvation: A Learning Proclivity in Neural Networks
☆61 Updated 4 years ago
mlfoundations / patching
Patching open-vocabulary models by interpolating weights
☆91 Updated 2 years ago
p-lambda / incontext-learning
Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici…
☆106 Updated 2 years ago
rehg-lab / CLRec
PyTorch implementation of "The Surprising Positive Knowledge Transfer in Continual 3D Object Shape Reconstruction"
☆33 Updated 3 years ago
IDSIA / recurrent-fwp
Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers" (NeurIPS 2021)
☆50 Updated 5 months ago
ischlag / fast-weight-transformers
Official code repository for the paper "Linear Transformers Are Secretly Fast Weight Programmers".
☆110 Updated 4 years ago
bigscience-workshop / architecture-objective
☆98 Updated 2 years ago
hlml / fortuitous_forgetting
☆19 Updated 3 years ago
sIncerass / powernorm
[ICML 2020] Code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845
☆120 Updated 4 years ago