The-AI-Summer / pytorch-ddpLinks

code for the ddp tutorial

☆32

Alternatives and similar repositories for pytorch-ddp

Users that are interested in pytorch-ddp are comparing it to the libraries listed below

Sorting:

lucidrains / tableformer-pytorch
Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch
☆39Updated 3 years ago
fkodom / soft-mixture-of-experts
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
☆75Updated last year
fattorib / Flax-ResNets
CIFAR10 ResNets implemented in JAX+Flax
☆12Updated 3 years ago
sarthmit / Mod_Arch
☆33Updated last year
LAION-AI / Conditional-Pretraining-of-Large-Language-Models
☆37Updated 2 years ago
ahennequ / pytorch-custom-mma
☆29Updated 2 years ago
YeonwooSung / Pytorch_mixture-of-experts
PyTorch implementation of moe, which stands for mixture of experts
☆45Updated 4 years ago
CyndxAI / QKNorm
Code for the paper "Query-Key Normalization for Transformers"
☆43Updated 4 years ago
vrvlive / knowlege-distillation
PyTorch, PyTorch Lightning framework for trying knowledge distillation in image classification problems
☆32Updated 11 months ago
pkuzengqi / Skyformer
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)
☆62Updated 3 years ago
bwconrad / soft-moe
PyTorch implementation of "From Sparse to Soft Mixtures of Experts"
☆58Updated last year
microsoft / AutoMoE
AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers
☆47Updated 2 years ago
decurtoydiaz / learning_with_signatures
Learning with Signatures
☆56Updated 3 years ago
schwartz-lab-NLP / papa
Code for the PAPA paper
☆27Updated 2 years ago
rwightman / imagenet-12k
ImageNet-12k subset of ImageNet-21k (fall11)
☆21Updated 2 years ago
rasbt / cvpr2023
☆133Updated last year
aniquetahir / JORA
JORA: JAX Tensor-Parallel LoRA Library (ACL 2024)
☆34Updated last year
microsoft / Stochastic-Mixture-of-Experts
This package implements THOR: Transformer with Stochastic Experts.
☆65Updated 3 years ago
amirzandieh / HyperAttention
Triton Implementation of HyperAttention Algorithm
☆48Updated last year
lucidrains / rela-transformer
Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
☆49Updated 3 years ago
leaderj1001 / Synthesizer-Rethinking-Self-Attention-Transformer-Models
Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch
☆70Updated 5 years ago
conceptofmind / t5-pytorch
Implementation of Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer in PyTorch.
☆51Updated last year
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆111Updated 6 months ago
jiaweizzhao / ZerO-initialization
☆74Updated 2 years ago
knotgrass / attention
several types of attention modules written in PyTorch for learning purposes
☆54Updated 9 months ago
MAC-AutoML / YOCO-BERT
The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Natu…
☆48Updated 4 years ago
sayakpaul / robustness-foundation-models
This repository holds code and other relevant files for the NeurIPS 2022 tutorial: Foundational Robustness of Foundation Models.
☆70Updated 2 years ago
lucidrains / deep-linear-network
A simple implementation of a deep linear Pytorch module
☆21Updated 4 years ago
VITA-Group / Random-MoE-as-Dropout
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…
☆52Updated 2 years ago
JeanKaddour / LAWA
Latest Weight Averaging (NeurIPS HITY 2022)
☆30Updated 2 years ago