Fast Discounted Cumulative Sums in PyTorch
☆98Aug 28, 2021Updated 4 years ago
Alternatives and similar repositories for torch-discounted-cumsum
Users that are interested in torch-discounted-cumsum are comparing it to the libraries listed below.
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing☆49Jan 27, 2022Updated 4 years ago
- ☆10Sep 13, 2021Updated 4 years ago
- Combining SOAP and MUON☆20Feb 11, 2025Updated last year
- A python library for highly configurable transformers - easing model architecture search and experimentation.☆48Nov 30, 2021Updated 4 years ago
- ESGD-M is a stochastic non-convex second order optimizer, suitable for training deep learning models, for PyTorch.☆57Sep 18, 2022Updated 3 years ago
- ☆81Jan 21, 2022Updated 4 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights☆19Oct 9, 2022Updated 3 years ago
- Contrastive Language-Image Pretraining☆146Sep 6, 2022Updated 3 years ago
- ☆21Mar 15, 2023Updated 3 years ago
- Replication attempt for the Protein Folding Model described in https://www.biorxiv.org/content/10.1101/2021.08.02.454840v1☆37May 19, 2022Updated 3 years ago
- CLASP - Contrastive Language-Aminoacid Sequence Pretraining☆142Sep 17, 2021Updated 4 years ago
- Efficient Householder Transformation in PyTorch☆69Jul 6, 2021Updated 4 years ago
- Spectral Tensor Train Parameterization of Deep Learning Layers☆17Jul 1, 2021Updated 4 years ago
- lanmt ebm☆12Jun 19, 2020Updated 5 years ago
- Hidden Engrams: Long Term Memory for Transformer Model Inference☆35Jun 26, 2021Updated 4 years ago
- Implementation of E(n)-Transformer, which incorporates attention mechanisms into Welling's E(n)-Equivariant Graph Neural Network☆226Jun 2, 2024Updated last year
- Very deep VAEs in JAX/Flax☆46Jun 16, 2021Updated 4 years ago
- Code publication to the paper "Normalized Attention Without Probability Cage"☆17Nov 9, 2021Updated 4 years ago
- A better PyTorch implementation of image local attention which reduces the GPU memory by an order of magnitude.☆141Dec 21, 2021Updated 4 years ago
- Accelerated First Order Parallel Associative Scan☆197Jan 7, 2026Updated 3 months ago
- Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, …☆39Aug 3, 2021Updated 4 years ago
- ☆26May 9, 2022Updated 3 years ago
- ☆30Nov 25, 2021Updated 4 years ago
- Code for the paper PermuteFormer☆42Oct 10, 2021Updated 4 years ago
- Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload☆132Aug 6, 2022Updated 3 years ago
- A denoising diffusion probabilistic model (DDPM) tailored for conditional generation of protein distograms☆144Apr 20, 2022Updated 3 years ago
- MNIST, but with Bezier curves instead of pixels☆15Oct 29, 2021Updated 4 years ago
- ☆29Jul 9, 2024Updated last year
- A Pytree Module system for Deep Learning in JAX☆212Feb 26, 2023Updated 3 years ago
- PyTorch optimizers with sparse momentum and weight decay☆10Oct 3, 2020Updated 5 years ago
- High-throughput simulation platform for Artificial Intelligence research☆227Dec 1, 2022Updated 3 years ago
- Neural Algorithmic Reasoning Tutorial☆12Dec 21, 2022Updated 3 years ago
- A2C is a special case of PPO!☆22May 20, 2022Updated 3 years ago
- ☆21Mar 14, 2021Updated 5 years ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch☆102Feb 25, 2023Updated 3 years ago
- An implementation of (Induced) Set Attention Block, from the Set Transformers paper☆67Jan 10, 2023Updated 3 years ago
- Contrastive Language-Audio Pretraining☆88Mar 6, 2022Updated 4 years ago
- ☆14Aug 18, 2022Updated 3 years ago
- Pedagogical codebase for a simplified score-based generative model design, with training loop☆40Aug 28, 2021Updated 4 years ago