DACUS1995 / pytorch-mmap-dataset
A custom PyTorch Dataset extension that provides faster iteration and better RAM usage
☆46 · Updated last year
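To place the technique: the idea is to back the Dataset with a memory-mapped file so samples are paged in from disk on demand rather than loaded into RAM up front. A minimal sketch follows; the file name, dtype, and the `MmapDataset` class are illustrative assumptions, not the repo's actual API.

```python
# Minimal sketch of a memory-mapped Dataset (illustrative; not the repo's actual API).
# Assumes samples were pre-serialized to "data.npy" as a float32 array of shape (N, ...).
import numpy as np
import torch
from torch.utils.data import Dataset

class MmapDataset(Dataset):
    def __init__(self, path="data.npy"):
        # mmap_mode="r" maps the file instead of reading it into RAM;
        # OS pages are faulted in lazily as samples are accessed.
        self.data = np.load(path, mmap_mode="r")

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Copy the row out of the mmap before converting, so the returned
        # tensor owns its memory and is safe to pass across DataLoader workers.
        return torch.from_numpy(np.array(self.data[idx]))
```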
Alternatives and similar repositories for pytorch-mmap-dataset
Users interested in pytorch-mmap-dataset are comparing it to the libraries listed below.
- Several types of attention modules written in PyTorch for learning purposes ☆52 · Updated last year
- PyTorch implementation of the sparse attention from the paper "Generating Long Sequences with Sparse Transformers" ☆92 · Updated last month
- [ICLR 2022] Official implementation of cosFormer attention from the paper "cosFormer: Rethinking Softmax in Attention" ☆196 · Updated 3 years ago
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha…) ☆70 · Updated 4 years ago
- ☆34 · Updated 5 months ago
- Code repository for the paper "Modelling Long Range Dependencies in ND: From Task-Specific to a General Purpose CNN" (https://arxiv.org/abs…) ☆183 · Updated 7 months ago
- PyTorch cyclic cosine decay learning rate scheduler ☆49 · Updated 4 years ago
- A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API ☆131 · Updated last year
- [NeurIPS 2022 Spotlight] Official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity" ☆74 · Updated 3 years ago
- State Space Models ☆71 · Updated last year
- Implementation of a memory-efficient multi-head attention as proposed in the paper "Self-attention Does Not Need O(n²) Memory" (a minimal sketch of the chunking idea follows this list) ☆386 · Updated 2 years ago
- PyTorch and PyTorch Lightning framework for trying out knowledge distillation in image classification problems ☆32 · Updated last year
- PyTorch implementation of Soft MoE by Google Brain from "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆79 · Updated 2 years ago
- Timm model explorer ☆42 · Updated last year
- Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption ☆109 · Updated 2 years ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆219 · Updated 2 years ago
- PyTorch implementation of MoE (mixture of experts) ☆51 · Updated 4 years ago
- A repository for DenseSSMs ☆89 · Updated last year
- [EMNLP 2022] Official implementation of Transnormer from the paper "The Devil in Linear Transformer" ☆64 · Updated 2 years ago
- A Tight-fisted Optimizer ☆50 · Updated 2 years ago
- Visualizer for PyTorch image models ☆44 · Updated 4 years ago
- ☆17 · Updated 2 years ago
- ☆95 · Updated 3 years ago
- Unofficial PyTorch implementation of Google's FNet: Mixing Tokens with Fourier Transforms, with checkpoints ☆77 · Updated 3 years ago
- ☆293 · Updated 11 months ago
- [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling ☆87 · Updated 2 years ago
- ☆187 · Updated last year
- Code for the experiments in "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy" ☆101 · Updated last year
- Implementation of TableFormer ("TableFormer: Robust Transformer Modeling for Table-Text Encoding") in PyTorch ☆39 · Updated 3 years ago
- Transformers w/o Attention, based fully on MLPs ☆95 · Updated last year
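For the memory-efficient attention repository listed above, here is a minimal sketch of the chunking idea behind "Self-attention Does Not Need O(n²) Memory": process queries in blocks so the full n×n score matrix is never materialized. The function name, single-head unbatched shapes, and chunk size are illustrative assumptions; the paper goes further and also chunks the keys/values with a streaming softmax.

```python
# Sketch of query-chunked exact attention in the spirit of
# "Self-attention Does Not Need O(n^2) Memory": only a (chunk x n)
# slice of the score matrix exists at any one time.
import torch
import torch.nn.functional as F

def chunked_attention(q, k, v, chunk_size=128):
    # q, k, v: (n, d) tensors; single-head, unbatched for clarity.
    scale = q.shape[-1] ** -0.5
    outputs = []
    for i in range(0, q.shape[0], chunk_size):
        q_chunk = q[i:i + chunk_size]          # (chunk, d)
        scores = (q_chunk @ k.T) * scale       # (chunk, n), never (n, n)
        outputs.append(F.softmax(scores, dim=-1) @ v)
    return torch.cat(outputs, dim=0)           # (n, d)
```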