BorealisAI / efficient-vit-training

PyTorch code for "Training a Vision Transformer from scratch in less than 24 hours with 1 GPU" (HiTY workshop at NeurIPS 2022)

☆23

Alternatives and similar repositories for efficient-vit-training

Users who are interested in efficient-vit-training are comparing it to the repositories listed below.

kyegomez / SimpleMamba
Implementation of a modular, high-performance, and simplistic mamba for high-speed applications
☆36 Updated 8 months ago
lessw2020 / FAdam_PyTorch
an implementation of FAdam (Fisher Adam) in PyTorch
☆48 Updated last month
kyegomez / TTL
Pytorch Implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States"
☆25 Updated 2 weeks ago
lucidrains / light-recurrent-unit-pytorch
Implementation of a Light Recurrent Unit in Pytorch
☆48 Updated 10 months ago
lucidrains / deep-cross-attention
Implementation of the proposed DeepCrossAttention by Heddes et al at Google research, in Pytorch
☆90 Updated 5 months ago
MzeroMiko / mamba-mini
An efficient pytorch implementation of selective scan in one file, works with both cpu and gpu, with corresponding mathematical derivatio…
☆93 Updated last year
NVlabs / STL
Official Pytorch Implementation of Self-emerging Token Labeling
☆35 Updated last year
kyegomez / Simba
A simpler Pytorch + Zeta Implementation of the paper: "SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series…
☆28 Updated 8 months ago
alenic / timm-models-explorer
Timm model explorer
☆41 Updated last year
lucidrains / hyper-connections
Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public
☆88 Updated last month
lucidrains / strassen-attention
Implementation of Strassen attention, from Kozachinskiy et al. of National Center of AI in Chile
☆41 Updated last month
AnonymousAlethiometer / SGD_SaI
Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need"
☆52 Updated 6 months ago
kyegomez / MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
☆200 Updated 2 weeks ago
kyegomez / MHMoE
Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch
☆27 Updated last week
kyegomez / MoE-Mamba
Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze…
☆109 Updated last week
MambaMixer / M2
☆47 Updated last year
crypdick / timm-lr-scheduler-explorer
A dashboard for exploring timm learning rate schedulers
☆19 Updated 8 months ago
aleemsidra / ConvLoRA
This repository contains the pytorch code for our work IEEE ISBI 2024 paper "ConvLoRA and AdaBN Based Domain Adaptation via Self-Training…
☆80 Updated 9 months ago
howard-hou / VisualRWKV
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks.
☆233 Updated 2 months ago
huggingface / pixparse
Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data
☆21 Updated last year
ariG23498 / mmdp
☆27 Updated last month
snu-mllab / LayerMerge
Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML 2024)
☆30 Updated 11 months ago
deepglint / RWKV-CLIP
[EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner
☆140 Updated 2 months ago
lucasjinreal / ImageTokenizer
imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video…
☆35 Updated last year
kytimmylai / NoisyNN-PyTorch
Unofficial NoisyNN implementation
☆50 Updated last year
WenjunHuang94 / ML-Mamba
ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2
☆66 Updated 8 months ago
OSVAI / KernelWarehouse
The official project website of "KernelWarehouse: Rethinking the Design of Dynamic Convolution" (KW for short, published in ICML 2024)
☆100 Updated last year
lucidrains / agent-attention-pytorch
Implementation of Agent Attention in Pytorch
☆91 Updated last year
samar-khanna / ExPLoRA
Official code repository for ICML 2025 paper: "ExPLoRA: Parameter-Efficient Extended Pre-training to Adapt Vision Transformers under Doma…
☆38 Updated 3 weeks ago
kyegomez / MambaByte
Implementation of MambaByte in "MambaByte: Token-free Selective State Space Model" in Pytorch and Zeta
☆120 Updated 2 weeks ago