BobMcDear / simsiam-pytorch
PyTorch implementation of SimSiam
☆8Updated last year
Related projects:
- ☆9Updated 8 months ago
- Layerwise Batch Entropy Regularization☆22Updated 2 years ago
- Blog post☆16Updated 7 months ago
- Jax implementation of "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆12Updated 4 months ago
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…☆60Updated 4 months ago
- ☆41Updated 2 months ago
- ☆30Updated 8 months ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021)☆58Updated 2 years ago
- ☆19Updated last month
- Code for the DDP tutorial☆31Updated 2 years ago
- Code for the PAPA paper☆27Updated last year
- An implementation of the paper "Retentive Network: A Successor to Transformer for Large Language Models" https://arxiv.org/pdf/2307.08621.pdf☆12Updated last year
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch☆35Updated 2 years ago
- ☆35Updated 5 months ago
- Example code from the Medium post titled "Optuna meets Weights and Biases."☆21Updated 2 years ago
- ☆42Updated 7 months ago
- An implementation of Transformer with Expire-Span, a circuit for learning which memories to retain☆33Updated 3 years ago
- ☆30Updated 3 months ago
- ☆27Updated this week
- Using FlexAttention to compute attention with different masking patterns☆28Updated last week
- ☆24Updated 2 months ago
- ☆41Updated 6 years ago
- An adaptive training algorithm for residual networks☆14Updated 4 years ago
- PyTorch reimplementation of the Smooth ReLU activation function proposed in the paper "Real World Large Scale Recommendation Systems Repr…☆21Updated 2 years ago
- ☆25Updated 5 months ago
- MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space☆40Updated 3 years ago
- Efficient PScan implementation in PyTorch☆15Updated 8 months ago
- ☆28Updated last week
- This project attempts to build neural network training and lightweighting cookbook including three kinds of lightweighting solutions, i.e…☆23Updated 2 years ago
- Sequence Modeling with Structured State Spaces☆60Updated 2 years ago