Nested Hierarchical Transformer https://arxiv.org/pdf/2105.12723.pdf
☆201 Jul 30, 2024 Updated last year
Alternatives and similar repositories for nested-transformer
Users who are interested in nested-transformer are comparing it to the repositories listed below
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language". ☆18 Sep 17, 2021 Updated 4 years ago
- [ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa… ☆76 Feb 21, 2022 Updated 4 years ago
- ☆17 Nov 4, 2022 Updated 3 years ago
- ☆249 Mar 16, 2022 Updated 3 years ago
- Escaping the Big Data Paradigm with Compact Transformers, 2021 (Train your Vision Transformers in 30 mins on CIFAR-10 with a single GPU!) ☆537 Nov 5, 2024 Updated last year
- ☆246 Jul 23, 2021 Updated 4 years ago
- [CVPR 2022] Official code for "Unified Contrastive Learning in Image-Text-Label Space" ☆407 Nov 10, 2023 Updated 2 years ago
- Official code for Cross-Covariance Image Transformer (XCiT) ☆674 Sep 28, 2021 Updated 4 years ago
- Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper) ☆222 Aug 23, 2022 Updated 3 years ago
- This is an official implementation for "ResT: An Efficient Transformer for Visual Recognition". ☆292 Sep 28, 2022 Updated 3 years ago
- [NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification ☆651 Jul 11, 2023 Updated 2 years ago
- A curated list of resources on what's happening in multimodal learning. Features recent papers, books, related lectures, and other relevant resou… ☆16 Apr 28, 2023 Updated 2 years ago
- Official DeiT repository ☆4,325 Mar 15, 2024 Updated last year
- Official repository for the "Big Transfer (BiT): General Visual Representation Learning" paper. ☆1,539 Jul 30, 2024 Updated last year
- L-Verse: Bidirectional Generation Between Image and Text ☆107 Apr 1, 2025 Updated 11 months ago
- An end-to-end PyTorch framework for image and video classification ☆1,613 Jun 27, 2024 Updated last year
- Code for the ECCV 2022 paper "Unleashing Transformers" ☆185 Apr 17, 2023 Updated 2 years ago
- PyTorch implementation of Asymmetric Siamese (https://arxiv.org/abs/2204.00613) ☆99 May 2, 2022 Updated 3 years ago
- ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet ☆1,192 Oct 27, 2023 Updated 2 years ago
- Improving Representation Learning for Histopathologic Images with Cluster Constraints ☆17 Jan 20, 2024 Updated 2 years ago
- TensorFlow implementation for "Improved Transformer for High-Resolution GANs" (NeurIPS 2021). ☆93 Jul 30, 2024 Updated last year
- Code release for the research paper "Exploring Long-Sequence Masked Autoencoders" ☆100 Oct 14, 2022 Updated 3 years ago
- PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral) ☆1,367 Jun 1, 2024 Updated last year
- An implementation of simple diffusion in PyTorch (and JAX) ☆34 Jan 28, 2023 Updated 3 years ago
- ☆30 Nov 21, 2019 Updated 6 years ago
- This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning" ☆15 Jun 22, 2023 Updated 2 years ago
- Multimodal classification solution for the SIGIR eCOM using Co-attention and transformer language models ☆19 Aug 17, 2020 Updated 5 years ago
- Codebase for Image Classification Research, written in PyTorch. ☆2,168 Mar 20, 2024 Updated last year
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆80 Jan 7, 2026 Updated last month
- Official code for the CVPR 2021 paper "How Well Do Self-Supervised Models Transfer?" ☆186 Oct 25, 2023 Updated 2 years ago
- Editing in Style: Uncovering the Local Semantics of GANs ☆15 Jul 2, 2020 Updated 5 years ago
- High-performance PyTorch modules ☆17 Jan 14, 2023 Updated 3 years ago
- Neighborhood Attention Transformer, arxiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arxiv 2022 ☆1,174 May 15, 2024 Updated last year
- [ECCV 2022] Robust Object Detection With Inaccurate Bounding Boxes ☆34 Oct 15, 2023 Updated 2 years ago
- Drop-in replacement for any ResNet with a significantly reduced memory footprint and better representation capabilities ☆207 Apr 24, 2024 Updated last year
- ☆19 Jan 27, 2021 Updated 5 years ago
- Learning Features with Parameter-Free Layers, ICLR 2022 ☆84 May 3, 2023 Updated 2 years ago
- ImageNet-12k subset of ImageNet-21k (fall11) ☆21 Jun 13, 2023 Updated 2 years ago
- VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images. ☆3,295 Mar 3, 2024 Updated 2 years ago