ViTAE-Transformer/QFormer

The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"

☆238

Alternatives and similar repositories for QFormer

Users who are interested in QFormer are comparing it to the repositories listed below.

ViTAE-Transformer / SAMText
The official repo for the technical report "Scalable Mask Annotation for Video Text Spotting"
☆16 · May 3, 2023 · Updated 3 years ago
ViTAE-Transformer / ViTAE-VSA
The official repo for [ECCV'22] "VSA: Learning Varied-Size Window Attention in Vision Transformers"
☆159 · Sep 25, 2025 · Updated 9 months ago
MiliLab / LogicOCR
[arXiv: 2505.12307] LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images?
☆35 · Dec 1, 2025 · Updated 7 months ago
MiliLab / S5
Official repo for [AAAI 2026 Oral] "S5: Scalable Semi-Supervised Semantic Segmentation in Remote Sensing"
☆36 · Dec 4, 2025 · Updated 7 months ago
Frenkie14 / Agrifood-Survey
The official repo for [ACM CSUR'24] "Empowering Agrifood System with Artificial Intelligence: A Survey of the Progress, Challenges and Op…
☆12 · Dec 6, 2024 · Updated last year
Hxyz-123 / ReasoningOCR
☆18 · Jul 24, 2025 · Updated 11 months ago
MiliLab / AnesSuite
Official repo for [ICLR 2026] "AnesSuite: A Comprehensive Benchmark and Dataset Suite for Anesthesiology Reasoning in LLMs"
☆25 · Feb 28, 2026 · Updated 4 months ago
btma48 / AutoLA
Code for our NeurIPS 2020 paper "Auto Learning Attention", coming soon
☆22 · Apr 14, 2021 · Updated 5 years ago
ViTAE-Transformer / ViTAE-Transformer-Matting
A comprehensive list [AIM@IJCAI'21, P3M@MM'21, GFM@IJCV'22, RIM@CVPR'23, P3MNet@IJCV'23] of our research works related to image matting, …
☆231 · Apr 11, 2023 · Updated 3 years ago
MiliLab / Text-Before-Vision
[ICML 2026] Text Before Vision: Staged Knowledge Injection Matters for Agentic RLVR in Ultra-High-Resolution Remote Sensing Understanding
☆16 · Mar 13, 2026 · Updated 4 months ago
ecoxial2007 / DCG_Enhanced_distilGPT2
This repository contains the implementation of the method described in our paper, "Divide and Conquer: Isolating Normal-Abnormal Attribut…
☆11 · Apr 9, 2024 · Updated 2 years ago
sunsmarterjie / iTPN
(CVPR2023/TPAMI2024) Integrally Pre-Trained Transformer Pyramid Networks -- A Hierarchical Vision Transformer for Masked Image Modeling
☆215 · Jul 28, 2024 · Updated last year
MiliLab / GeoBridge
Official repo for [CVPR 2026] "GeoBridge: A Semantic-Anchored Multi-View Foundation Model Bridging Images and Text for Geo-Localization"
☆38 · May 14, 2026 · Updated 2 months ago
DREAMXFAR / FCL-Net
This is the PyTorch implementation of FCL-Net, accepted by NN'2022.
☆15 · May 25, 2022 · Updated 4 years ago
ViTAE-Transformer / ViTDet
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"
☆586 · Apr 24, 2022 · Updated 4 years ago
hiker-lw / RealRain-1k
Official repository for RealRain-1k
☆33 · Jul 6, 2025 · Updated last year
MiliLab / GeoZero
Official repo for "GeoZero: Incentivizing Reasoning from Scratch on Geospatial Scenes"
☆27 · Feb 11, 2026 · Updated 5 months ago
LeapLabTHU / DAT
Repository of Vision Transformer with Deformable Attention (CVPR2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Atte…
☆939 · Apr 17, 2024 · Updated 2 years ago
ming053l / S3RNet
[JSTARS'26] S3RNet: Sparse Spatial-Spectral Representation with Hybrid Knowledge Distillation for Efficient Multispectral and Hyperspect…
☆14 · Jun 16, 2026 · Updated last month
PengtaoJiang / TSP6K
The official PyTorch code for "Traffic Scene Parsing through the TSP6K Dataset".
☆34 · Jul 6, 2025 · Updated last year
Annbless / ViTAE
The official PyTorch implementation of ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
☆104 · Apr 12, 2022 · Updated 4 years ago
ViTAE-Transformer / ViTAE-Transformer-Remote-Sensing
A comprehensive list [SAMRS@NeurIPS'23, RVSA@TGRS'22, RSP@TGRS'22] of our research works related to remote sensing, including papers, cod…
☆485 · Jun 6, 2024 · Updated 2 years ago
JizhiziLi / RIM
[CVPR 2023] Referring Image Matting
☆207 · Apr 17, 2023 · Updated 3 years ago
CoderChen01 / InterCLIP-MEP
Official repository of the paper "InterCLIP-MEP: Interactive CLIP and Memory-Enhanced Predictor for Multi-modal Sarcasm Detection"
☆16 · Nov 13, 2025 · Updated 8 months ago
cschenxiang / UAV-Rain1k
UAV-Rain1k: A Benchmark for Raindrop Removal from UAV Aerial Imagery (CVPRW 2024)
☆35 · Apr 13, 2024 · Updated 2 years ago
MiliLab / REX-RAG
Official repo for "REX-RAG: Reasoning Exploration with Policy Correction in Retrieval-Augmented Generation"
☆35 · Sep 28, 2025 · Updated 9 months ago
LeapLabTHU / Agent-Attention
[ECCV 2024] Official repository of Agent Attention
☆668 · Nov 17, 2024 · Updated last year
phymhan / S2D2
☆16 · Jun 17, 2026 · Updated last month
hkzhang-git / FcaFormer
[ICCV 2023] Source code of "Fcaformer: Forward Cross Attention in Hybrid Vision Transformer"
☆25 · Aug 23, 2023 · Updated 2 years ago
DengPingFan / SOC-DataAug
Salient Objects in Clutter, arXiv, 2021 (ECCV2018 extension).
☆11 · Jun 17, 2021 · Updated 5 years ago
Hxyz-123 / GoMatching
[NeurIPS'24] GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching
☆33 · May 29, 2025 · Updated last year
Asunatan / LM-Net
A Light-weight and Multi-scale Network for Medical Image Segmentation
☆30 · Jun 8, 2025 · Updated last year
sunnyHelen / JPerceiver
[ECCV 2022] JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes
☆79 · Nov 4, 2022 · Updated 3 years ago
dongbo811 / AFFormer
☆136 · Jan 19, 2023 · Updated 3 years ago
scofield7419 / MUIE-REAMO
Code of the Grounded MUIE model, REAMO
☆11 · Dec 3, 2024 · Updated last year
NVlabs / TokenBench
A Video Tokenizer Evaluation Dataset
☆157 · Jan 13, 2025 · Updated last year
WHU-ZQH / FSAM4PLM
[EMNLP22] Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models
☆22 · Mar 27, 2023 · Updated 3 years ago
hammoudiproject / SuperpixelGridMasks
SuperpixelGridMasks is an approach to sensor-based data augmentation for image classification and related tasks.
☆14 · Jan 18, 2023 · Updated 3 years ago
ziplab / HVT
[ICCV 2021] Official implementation of "Scalable Vision Transformers with Hierarchical Pooling"
☆32 · Dec 30, 2021 · Updated 4 years ago