Trains Transformer model variants. Data isn't shuffled between batches.
☆146 Oct 5, 2022 Updated 3 years ago
Alternatives and similar repositories for transformer-sequential
Users that are interested in transformer-sequential are comparing it to the libraries listed below.
- Neural Ensemble Search for Uncertainty Estimation and Dataset Shift ☆35 Jan 10, 2026 Updated 4 months ago
- Implementation of DropLoss for Long-Tail Instance Segmentation in Pytorch ☆44 Apr 14, 2021 Updated 5 years ago
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021). ☆229 Apr 18, 2022 Updated 4 years ago
- My implementation of DeepMind's Perceiver ☆65 Apr 23, 2021 Updated 5 years ago
- Expressive Power of Invariant and Equivariant Graph Neural Networks (ICLR 2021) ☆42 Aug 25, 2023 Updated 2 years ago
- GAN models implemented with Pytorch Lightning and Hydra configuration ☆33 Jun 5, 2022 Updated 3 years ago
- Estimating Example Difficulty using Variance of Gradients ☆66 Jan 10, 2023 Updated 3 years ago
- Code for the CVPR 2020 [ORAL] paper "SAM: The Sensitivity of Attribution Methods to Hyperparameters" ☆29 Dec 8, 2022 Updated 3 years ago
- Sequence modeling with Mega. ☆303 Jan 28, 2023 Updated 3 years ago
- Collection of machine learning research paper references ☆25 Feb 23, 2025 Updated last year
- Lightweight Cluster/Cloud VM Job Management 🚀 ☆44 Aug 27, 2024 Updated last year
- ☆100 Dec 8, 2021 Updated 4 years ago
- An efficient implementation of the popular sequence models for text generation, summarization, and translation tasks. https://arxiv.org/p… ☆435 Aug 17, 2022 Updated 3 years ago
- Euclidean Wasserstein-2 optimal transportation ☆48 Aug 19, 2023 Updated 2 years ago
- lanmt ebm ☆12 Jun 19, 2020 Updated 5 years ago
- ☆391 Oct 18, 2023 Updated 2 years ago
- Official code Cross-Covariance Image Transformer (XCiT) ☆679 Sep 28, 2021 Updated 4 years ago
- Course notes and notebooks to teach the fundamentals of how deep learning works; uses PyTorch. ☆83 Feb 16, 2021 Updated 5 years ago
- Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc. ☆246 Feb 16, 2026 Updated 3 months ago
- A lightweight experimental logging library ☆54 Dec 23, 2025 Updated 4 months ago
- Code for paper "Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning". ☆14 May 23, 2021 Updated 4 years ago
- (K3IM) Keras 3 Image Models ☆22 Feb 22, 2024 Updated 2 years ago
- ☆30 Jan 17, 2022 Updated 4 years ago
- Source code and data for the paper "Towards String-to-Tree Neural Machine Translation" ☆16 Dec 31, 2017 Updated 8 years ago
- ☆114 Aug 6, 2024 Updated last year
- http://nlp.seas.harvard.edu/2018/04/03/attention.html ☆63 May 20, 2021 Updated 5 years ago
- Implementation of Feedback Transformer in Pytorch ☆108 Mar 2, 2021 Updated 5 years ago
- A Python library for mathematical optimization ☆143 Sep 27, 2024 Updated last year
- GPT, but made only out of MLPs ☆89 May 25, 2021 Updated 4 years ago
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… ☆359 Feb 22, 2022 Updated 4 years ago
- ReConsider is a re-ranking model that re-ranks the top-K (passage, answer-span) predictions of an Open-Domain QA Model like DPR (Karpukhi… ☆49 Apr 26, 2021 Updated 5 years ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 Dec 5, 2022 Updated 3 years ago
- Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis. ☆147 Jul 26, 2021 Updated 4 years ago
- My assignment solutions for Stanford’s CS231n (CNNs for Visual Recognition) and Michigan’s EECS 498-007/598-005 (Deep Learning for Comput… ☆135 Apr 4, 2021 Updated 5 years ago
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering ☆174 Jun 6, 2021 Updated 4 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 Aug 12, 2023 Updated 2 years ago
- Neural Text Generation with Unlikelihood Training ☆311 Aug 31, 2021 Updated 4 years ago
- Japanese CLIP model ☆13 Sep 15, 2025 Updated 8 months ago
- Implementation of Memformer, a Memory-augmented Transformer, in Pytorch ☆126 Nov 13, 2020 Updated 5 years ago