BorealisAI / efficient-vit-training
PyTorch code for "Training a Vision Transformer from scratch in less than 24 hours with 1 GPU" (HiTY workshop at NeurIPS 2022)
☆20 · Updated last year
Alternatives and similar repositories for efficient-vit-training:
Users who are interested in efficient-vit-training are comparing it to the repositories listed below:
- An implementation of FAdam (Fisher Adam) in PyTorch ☆43 · Updated 10 months ago
- Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch ☆33 · Updated 4 months ago
- Implementation of the proposed DeepCrossAttention by Heddes et al. at Google Research, in PyTorch ☆81 · Updated last month
- Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML'24) ☆29 · Updated 7 months ago
- An efficient PyTorch implementation of selective scan in one file, works with both CPU and GPU, with corresponding mathematical derivatio… ☆81 · Updated last year
- Implementation of a Light Recurrent Unit in PyTorch ☆47 · Updated 6 months ago
- A big_vision inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆34 · Updated 9 months ago
- Implementation of a modular, high-performance, and simplistic Mamba for high-speed applications ☆33 · Updated 4 months ago
- ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2 ☆63 · Updated 4 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆86 · Updated this week
- Official code repository for paper: "ExPLoRA: Parameter-Efficient Extended Pre-training to Adapt Vision Transformers under Domain Shifts" ☆31 · Updated 6 months ago
- PyTorch implementation of the paper: "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆24 · Updated this week
- The official implementation of the paper "Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation" ☆20 · Updated 4 months ago
- [ECCV 2024] Isomorphic Pruning for Vision Models ☆66 · Updated 8 months ago
- The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters" ☆28 · Updated this week
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆71 · Updated last year
- Implementation of "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with… ☆28 · Updated this week
- Community implementation of the paper: "Multi-Head Mixture-of-Experts" in PyTorch ☆23 · Updated this week
- Official PyTorch implementation for the paper "No More Adam: Learning Rate Scaling at Initialization is All You Need" ☆50 · Updated 2 months ago
- The official project website of "KernelWarehouse: Rethinking the Design of Dynamic Convolution" (KW for short, published in ICML 2024) ☆97 · Updated 9 months ago
- This repository contains the PyTorch code for our IEEE ISBI 2024 paper "ConvLoRA and AdaBN Based Domain Adaptation via Self-Training… ☆70 · Updated 5 months ago
- [ICCV 2023] TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance ☆89 · Updated 8 months ago
- Official PyTorch implementation of Self-emerging Token Labeling ☆32 · Updated last year
- Integration of Swin Transformer to DETR for Robust Object Detection (DEMO) ☆28 · Updated 3 years ago
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public ☆82 · Updated last month
- Official PyTorch implementation for "IFORMER: INTEGRATING CONVNET AND TRANSFORMER FOR MOBILE APPLICATION" [ICLR 2025] ☆40 · Updated 2 weeks ago
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024 ☆58 · Updated last month