PrincetonUniversity / multi_gpu_training
☆326 · Updated 2 months ago
Alternatives and similar repositories for multi_gpu_training
Users who are interested in multi_gpu_training are comparing it to the libraries listed below.
- Example of how to use Weights & Biases on Slurm ☆114 · Updated 2 years ago
- Annotated version of the Mamba paper ☆483 · Updated last year
- TensorDict is a pytorch dedicated tensor container. ☆925 · Updated this week
- Universal Tensor Operations in Einstein-Inspired Notation for Python. ☆369 · Updated last month
- Helpful tools and examples for working with flex-attention ☆766 · Updated last week
- A convenient way to trigger synchronizations to wandb / Weights & Biases if your compute nodes don't have internet! ☆77 · Updated last week
- Python 3.8+ toolbox for submitting jobs to Slurm ☆1,426 · Updated 2 weeks ago
- Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory ☆437 · Updated 8 months ago
- [ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer" ☆321 · Updated 4 months ago
- ☆50 · Updated 11 months ago
- Implementation of https://srush.github.io/annotated-s4 ☆494 · Updated 2 years ago
- Named tensors with first-class dimensions for PyTorch ☆320 · Updated last year
- Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory" ☆378 · Updated last year
- Implementation of Diffusion Transformer (DiT) in JAX ☆275 · Updated 11 months ago
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ☆514 · Updated this week
- ☆186 · Updated 3 months ago
- A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac… ☆229 · Updated 4 months ago
- ☆226 · Updated 3 months ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆289 · Updated last month
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆317 · Updated 4 months ago
- FFCV-SSL Fast Forward Computer Vision for Self-Supervised Learning. ☆207 · Updated last year
- Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch ☆673 · Updated 5 months ago
- TorchOpt is an efficient library for differentiable optimization built upon PyTorch. ☆588 · Updated last week
- Code for our NeurIPS 2022 paper ☆368 · Updated 2 years ago
- Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs ☆269 · Updated 3 weeks ago
- ☆155 · Updated last year
- Implementation of ST-Moe, the latest incarnation of MoE after years of research at Brain, in Pytorch ☆331 · Updated 11 months ago
- Puzzles for exploring transformers ☆348 · Updated 2 years ago
- A simple library for scaling up JAX programs ☆134 · Updated 6 months ago
- Tensors, for human consumption ☆1,251 · Updated 5 months ago