BobMcDear / simsiam-pytorch
PyTorch implementation of SimSiam
☆8Updated last year
Alternatives and similar repositories for simsiam-pytorch:
Users who are interested in simsiam-pytorch are comparing it to the libraries listed below
- ☆9Updated last year
- Layerwise Batch Entropy Regularization☆22Updated 2 years ago
- Code for Accelerated Linearized Laplace Approximation for Bayesian Deep Learning (ELLA, NeurIPS '22)☆16Updated 2 years ago
- ☆47Updated 6 months ago
- code for the ddp tutorial☆32Updated 2 years ago
- Scalable Computation of Hessian Diagonals☆12Updated 7 months ago
- HomebrewNLP in JAX flavour for maintainable TPU-Training☆47Updated 11 months ago
- Code for testing DCT plus Sparse (DCTpS) networks☆14Updated 3 years ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper☆57Updated last year
- ☆19Updated 2 years ago
- Jax implementation of "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆13Updated 8 months ago
- Code base for MomentumRNN.☆18Updated 4 years ago
- Local Attention - Flax module for Jax☆20Updated 3 years ago
- Implementation of the article **Deep Learning CUDA Memory Usage and Pytorch optimization tricks**☆43Updated 5 years ago
- Sequence Modeling with Structured State Spaces☆61Updated 2 years ago
- Official code for the paper "Context-Aware Language Modeling for Goal-Oriented Dialogue Systems"☆34Updated 2 years ago
- The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s…☆66Updated 2 years ago
- Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper☆80Updated 3 years ago
- Blog post☆16Updated 11 months ago
- A study on the following problems: what the memorization problem is in meta-learning; why memorization problem happens; and how we can pr…☆21Updated last year
- ☆31Updated last month
- ☆11Updated last year
- A package for fine tuning of pretrained NLP transformers using Semi Supervised Learning☆15Updated 3 years ago
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".☆32Updated 3 years ago
- Explores the ideas presented in Deep Ensembles: A Loss Landscape Perspective (https://arxiv.org/abs/1912.02757) by Stanislav Fort, Huiyi …☆62Updated 4 years ago
- ☆31Updated 9 months ago
- An implementation of Transformer with Expire-Span, a circuit for learning which memories to retain☆33Updated 4 years ago
- GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification☆32Updated 3 years ago
- ☆21Updated 3 years ago
- Experiments on GPT-3's ability to fit numerical models in-context.☆14Updated 2 years ago