DACUS1995 / pytorch-mmap-dataset

A custom PyTorch Dataset extension that provides faster iteration and better RAM usage

☆46

Alternatives and similar repositories for pytorch-mmap-dataset

Users who are interested in pytorch-mmap-dataset compare it to the libraries listed below.

ag1988 / top_k_attention
The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha…
☆69Updated 4 years ago
knotgrass / attention
several types of attention modules written in PyTorch for learning purposes
☆52Updated last year
HKUNLP / efficient-attention
[EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling
☆87Updated 2 years ago
ziplab / EcoFormer
[NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity"
☆74Updated 3 years ago
LukasHedegaard / pytorch-benchmark
Easily benchmark PyTorch model FLOPs, latency, throughput, allocated gpu memory and energy consumption
☆109Updated 2 years ago
WailordHe / DenseSSM
A repository for DenseSSMs
☆89Updated last year
vra / flopth
A simple program to calculate and visualize the FLOPs and Parameters of Pytorch models, with handy CLI and easy-to-use Python API.
☆131Updated 11 months ago
kyegomez / Blockwise-Parallel-Transformer
32 times longer context window than vanilla Transformers and up to 4 times longer than memory efficient Transformers.
☆48Updated 2 years ago
kyegomez / SparseAttention
Pytorch Implementation of the sparse attention from the paper: "Generating Long Sequences with Sparse Transformers"
☆92Updated 3 weeks ago
VITA-Group / AsViT
[ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa…
☆76Updated 3 years ago
fkodom / soft-mixture-of-experts
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
☆78Updated 2 years ago
lucidrains / flash-cosine-sim-attention
Implementation of fused cosine similarity attention in the same style as Flash Attention
☆216Updated 2 years ago
krafton-ai / mambaformer-icl
MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248
☆57Updated last year
kssteven418 / LTP
[KDD'22] Learned Token Pruning for Transformers
☆101Updated 2 years ago
buttercutter / Mamba_SSM
A simple implementation of [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752)
☆22Updated last year
bwconrad / soft-moe
PyTorch implementation of "From Sparse to Soft Mixtures of Experts"
☆66Updated 2 years ago
yuzhenmao / IceFormer
Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024).
☆25Updated 4 months ago
NVlabs / EfficientDL
☆34Updated 5 months ago
OpenNLPLab / Transnormer
[EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer
☆63Updated 2 years ago
abhuse / cyclic-cosine-decay
Pytorch cyclic cosine decay learning rate scheduler
☆49Updated 4 years ago
vrvlive / knowlege-distillation
PyTorch, PyTorch Lightning framework for trying knowledge distillation in image classification problems
☆32Updated last year
facebookresearch / adaptive_scheduling
Experimental scripts for researching data adaptive learning rate scheduling.
☆22Updated 2 years ago
davidpicard / deepseed
☆29Updated 4 years ago
lucidrains / autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
☆45Updated 2 years ago
BBuf / flash-rwkv
☆32Updated last year
LAION-AI / Conditional-Pretraining-of-Large-Language-Models
☆37Updated 2 years ago
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆113Updated 10 months ago
badripatro / mamba360
State Space Models
☆71Updated last year
OpenNLPLab / cosFormer
[ICLR 2022] Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention
☆196Updated 2 years ago
bojone / tiger
A Tight-fisted Optimizer
☆50Updated 2 years ago