[ICCV2023] This is an official implementation for "Scale-Aware Modulation Meet Transformer".
☆214Aug 1, 2023Updated 2 years ago
Alternatives and similar repositories for SMT
Users who are interested in SMT are comparing it to the repositories listed below
- [MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking f…☆20Dec 4, 2024Updated last year
- [CVPR 2023] Official code release of our paper "BiFormer: Vision Transformer with Bi-Level Routing Attention"☆575May 22, 2023Updated 2 years ago
- [ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design☆227Nov 14, 2023Updated 2 years ago
- ☆22May 30, 2023Updated 2 years ago
- [ACM MM 2022] Marior: Margin Removal and Iterative Content Rectification for Document Dewarping in the Wild☆25Aug 12, 2022Updated 3 years ago
- ☆85Aug 30, 2023Updated 2 years ago
- The official code of "Rethinking Local Perception in Lightweight Vision Transformer"☆91May 11, 2023Updated 2 years ago
- (CVPR2024) RMT: Retentive Networks Meet Vision Transformer☆384Jul 29, 2024Updated last year
- [ICCV 2023] Official PyTorch implementation of "Rethinking Mobile Block for Efficient Attention-based Models"☆254Oct 24, 2023Updated 2 years ago
- [CVPR 2023] Official implementation of the paper "Lite DETR : An Interleaved Multi-Scale Encoder for Efficient DETR"☆208Jun 1, 2023Updated 2 years ago
- [ICCV'23] Cascade-DETR: Delving into High-Quality Universal Object Detection☆99Sep 12, 2023Updated 2 years ago
- [NeurIPS 2022] HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions☆346Dec 30, 2025Updated 2 months ago
- The official implementation of CTRNet++.☆14Dec 30, 2024Updated last year
- ☆149Jun 25, 2024Updated last year
- [ICCV 2023] Official repository of FLatten Transformer☆446Nov 4, 2024Updated last year
- ☆127Jan 31, 2024Updated 2 years ago
- [NeurIPS2023] Lightweight Vision Transformer with Bidirectional Interaction☆27Oct 27, 2023Updated 2 years ago
- Official implementation of the CVPR 2024 paper ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense…☆342Jan 21, 2025Updated last year
- Official Code of Paper "Reversible Column Networks" "RevColv2"☆265Sep 6, 2023Updated 2 years ago
- [CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions☆2,802Mar 25, 2025Updated 11 months ago
- (ICCV 2023) ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer☆78Apr 9, 2024Updated last year
- Code release for "VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning" & "VSCode-v2: Dynamic Prompt L…☆60Dec 8, 2025Updated 3 months ago
- [ACCV 2024] Official code for "DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention"☆32Jan 8, 2025Updated last year
- [AAAI2025 Oral] Predicting the Original Appearance of Damaged Historical Documents☆107Jul 15, 2025Updated 8 months ago
- [ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions☆1,476Jun 3, 2025Updated 9 months ago
- ☆31Dec 18, 2025Updated 3 months ago
- ☆159Mar 13, 2024Updated 2 years ago
- ☆157May 8, 2025Updated 10 months ago
- A simple minimal implementation of Reversible Vision Transformers☆128Mar 14, 2024Updated 2 years ago
- [IEEE TPAMI'23] Pyramid Pooling Transformer for Scene Understanding☆220Jun 16, 2025Updated 9 months ago
- ☆214Dec 17, 2021Updated 4 years ago
- [ICCV - 2023] Official repository of paper SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applic…☆314Jul 18, 2025Updated 8 months ago
- [ICCV 2023] Code base for Revisiting Scene Text Recognition: A Data Perspective☆202Nov 1, 2023Updated 2 years ago
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022)☆84Nov 2, 2022Updated 3 years ago
- Official code for paper "On the Connection between Local Attention and Dynamic Depth-wise Convolution" ICLR 2022 Spotlight☆185Nov 17, 2022Updated 3 years ago
- The code of paper Efficient Camouflaged Object Detection Network Based on Global Localization Perception and Local Guidance Refinement☆27Nov 26, 2024Updated last year
- VMamba: Visual State Space Models; code is based on mamba☆3,079Mar 7, 2025Updated last year
- MetaFormer Baselines for Vision (TPAMI 2024)☆495Jun 1, 2024Updated last year
- iFormer: Inception Transformer☆248Jan 14, 2023Updated 3 years ago