Code for TCFormer, from the paper "Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer"
☆243Aug 3, 2024Updated last year
Alternatives and similar repositories for TCFormer
Users who are interested in TCFormer are comparing it to the repositories listed below
- Python code for ICLR 2022 spotlight paper EViT: Expediting Vision Transformers via Token Reorganizations☆199Sep 3, 2023Updated 2 years ago
- PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)☆1,367Jun 1, 2024Updated last year
- ☆214Dec 17, 2021Updated 4 years ago
- [NeurIPS 2021] This is an official implementation of our paper "HRFormer: High-Resolution Transformer for Dense Prediction".☆523Oct 19, 2022Updated 3 years ago
- Neighborhood Attention Transformer, arxiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arxiv 2022☆1,174May 15, 2024Updated last year
- Official PyTorch implementation of A-ViT: Adaptive Tokens for Efficient Vision Transformer (CVPR 2022)☆166Jul 14, 2022Updated 3 years ago
- Official implementation of PVT series☆1,887Oct 27, 2022Updated 3 years ago
- ☆57Oct 17, 2021Updated 4 years ago
- Directed masked autoencoders☆14Feb 20, 2026Updated last week
- [ECCV'2022 Oral] PyTorch implementation for: SimCC: a Simple Coordinate Classification Perspective for Human Pose Estimation (http://arxi…☆341Jul 17, 2022Updated 3 years ago
- Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP.☆37Oct 15, 2022Updated 3 years ago
- PyTorch implementation of Mix-Shifting-MLP (MS-MLP)☆16Feb 16, 2022Updated 4 years ago
- [ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions☆1,475Jun 3, 2025Updated 9 months ago
- [AAAI 2022] This is the official PyTorch implementation of "Less is More: Pay Less Attention in Vision Transformers"☆97Jun 19, 2022Updated 3 years ago
- [NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"☆378Sep 16, 2022Updated 3 years ago
- VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation☆86Sep 12, 2024Updated last year
- [CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting☆544Sep 15, 2023Updated 2 years ago
- [NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification☆651Jul 11, 2023Updated 2 years ago
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022)☆85Nov 2, 2022Updated 3 years ago
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".☆29Jan 23, 2024Updated 2 years ago
- Repository of Vision Transformer with Deformable Attention (CVPR2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Atte…☆926Apr 17, 2024Updated last year
- The official code for the paper: https://openreview.net/forum?id=_PHymLIxuI☆401Jan 14, 2024Updated 2 years ago
- ViT for few-shot classification☆47Mar 24, 2023Updated 2 years ago
- A method to increase the speed and lower the memory footprint of existing vision transformers.☆1,170Jun 17, 2024Updated last year
- Open-source code for the research work published on arXiv: https://arxiv.org/abs/2106.02689☆54Feb 14, 2022Updated 4 years ago
- ☆19Nov 25, 2022Updated 3 years ago
- [NeurIPS2022] Official implementation of the paper 'Green Hierarchical Vision Transformer for Masked Image Modeling'.☆177Jan 16, 2023Updated 3 years ago
- TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"☆36Dec 17, 2021Updated 4 years ago
- [NeurIPS 2024] official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone".☆43Jan 21, 2025Updated last year
- [CVPR2022 - Oral] Official Jax Implementation of Learned Queries for Efficient Local Attention☆118Apr 19, 2022Updated 3 years ago
- The official repo for ECCV'22 paper: Pose for Everything: Towards Category-Agnostic Pose Estimation☆219May 23, 2024Updated last year
- Official code for paper "On the Connection between Local Attention and Dynamic Depth-wise Convolution" ICLR 2022 Spotlight☆185Nov 17, 2022Updated 3 years ago
- This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".☆1,024Sep 29, 2022Updated 3 years ago
- Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.☆153Jan 14, 2022Updated 4 years ago
- [ICLR2022] official implementation of UniFormer☆896Mar 29, 2024Updated last year
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation☆95Mar 1, 2025Updated last year
- ☆92Jan 22, 2021Updated 5 years ago
- ConvMAE: Masked Convolution Meets Masked Autoencoders☆524Mar 14, 2023Updated 2 years ago
- ☆86Feb 5, 2024Updated 2 years ago