fattorib / Flax-ResNets

CIFAR10 ResNets implemented in JAX+Flax

☆12

Alternatives and similar repositories for Flax-ResNets

Users who are interested in Flax-ResNets are comparing it to the libraries listed below.

ayulockin / LossLandscape
Explores the ideas presented in Deep Ensembles: A Loss Landscape Perspective (https://arxiv.org/abs/1912.02757) by Stanislav Fort, Huiyi …
☆66 Updated 5 years ago
facebookresearch / Pitfalls-of-Memorization
Understanding the interplay between memorization and generalization in neural networks, featuring MAT, a learning algorithm to enhance ro…
☆40 Updated 11 months ago
AnanyaKumar / transfer_learning
Framework code with wandb, checkpointing, logging, configs, experimental protocols. Useful for fine-tuning models or training from scratc…
☆152 Updated 2 years ago
gregorbachmann / scaling_mlps
☆52 Updated last year
jiaweizzhao / ZerO-initialization
☆75 Updated 3 years ago
tml-epfl / understanding-sam
Towards Understanding Sharpness-Aware Minimization [ICML 2022]
☆36 Updated 3 years ago
edwardjhu / TP4
Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522)
☆63 Updated 4 years ago
JeanKaddour / NoTrainNoGain
Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023)
☆81 Updated 2 years ago
naver-ai / model-stock
Model Stock: All we need is just a few fine-tuned models
☆127 Updated 3 months ago
facebookresearch / ModelRatatouille
Recycling diverse models
☆46 Updated 2 years ago
JeanKaddour / LAWA
Latest Weight Averaging (NeurIPS HITY 2022)
☆31 Updated 2 years ago
google-research / jax-influence
☆63 Updated 3 years ago
alexrame / diwa
DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization
☆31 Updated 2 years ago
srush / do-we-need-attention
☆166 Updated 2 years ago
tml-epfl / why-weight-decay
Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]
☆68 Updated last year
KellerJordan / REPAIR
Code release for REPAIR: REnormalizing Permuted Activations for Interpolation Repair
☆51 Updated last year
google-deepmind / distribution_shift_framework
This repository contains the code of the distribution shift framework presented in A Fine-Grained Analysis on Distribution Shift (Wiles e…
☆84 Updated 3 weeks ago
fjzzq2002 / random_transformers
Official code for "Algorithmic Capabilities of Random Transformers" (NeurIPS 2024)
☆15 Updated last year
lucidrains / pause-transformer
Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…
☆53 Updated 2 years ago
stanislavfort / dissect-git-re-basin
Replicating and dissecting the git-re-basin project in one-click-replication Colabs
☆36 Updated 3 years ago
Ping-C / optimizer
This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine…
☆40 Updated 2 years ago
js-d / sim_metric
☆37 Updated 2 years ago
tding1 / Neural-Collapse
[NeurIPS 2021] A Geometric Analysis of Neural Collapse with Unconstrained Features
☆59 Updated 3 years ago
gibipara92 / learning-explanations-hard-to-vary
Code to implement the AND-mask and geometric mean to do gradient based optimization, from the paper "Learning explanations that are hard …
☆41 Updated 5 years ago
RAIVNLab / supsup
Code for "Supermasks in Superposition"
☆124 Updated 2 years ago
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆113 Updated 10 months ago
krafton-ai / mambaformer-icl
MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248
☆57 Updated last year
hlml / fortuitous_forgetting
☆19 Updated 3 years ago
apple / learning-subspaces
☆133 Updated 4 years ago
hsouri / BayesianTransferLearning
☆109 Updated 3 years ago