iCVTEAM / M3TR
M3TR: Multi-modal Multi-label Recognition with Transformer. ACM MM 2021
☆15 · Updated 3 years ago
Alternatives and similar repositories for M3TR
Users who are interested in M3TR compare it to the repositories listed below.
- PyTorch Implementation of Deep Equilibrium Multimodal Fusion ☆21 · Updated last year
- The official implementation for ALOFT (CVPR 2023). ☆55 · Updated last year
- [ACMMM 2020] Code release for "Learning Deep Multimodal Feature Representation with Asymmetric Multi-layer Fusion" ☆28 · Updated 3 years ago
- ☆26 · Updated 2 years ago
- 2021 AAAI Modular Graph Transformer Networks for Multi-Label Image Classification; Official GitHub: https://github.com/ReML-AI/MGTN ☆21 · Updated 3 years ago
- ☆152 · Updated last year
- Implementation of vision transformer. ⭐⭐⭐ ☆33 · Updated 3 years ago
- [ECCV 2022] LAFF for Text-to-Video Retrieval ☆45 · Updated last year
- ☆10 · Updated 3 years ago
- The official repository of the paper "Learning Correlation Structures for Vision Transformers" accepted to CVPR 2024. ☆48 · Updated last year
- This repo shows the source code of IEEE TGRS 2022 article: Sonar Images Classification While Facing Long-Tail and Few-Shot. ☆15 · Updated last year
- Scattering Vision Transformer ☆53 · Updated last year
- [PR 2022, Highly Cited Paper] Learning Attention-Guided Pyramidal Features for Few-shot Fine-grained Recognition ☆17 · Updated 2 years ago
- ☆21 · Updated last year
- ☆143 · Updated last year
- This repo includes the CUB-GHA (Gaze-based Human Attention) dataset and code of the paper "Human Attention in Fine-grained Classification… ☆31 · Updated 3 years ago
- Official code release of our paper "EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention" ☆21 · Updated 9 months ago
- ☆147 · Updated 10 months ago
- ☆85 · Updated last year
- PyTorch implementation of Deep Semantic Dictionary Learning for Multi-label Image Classification, AAAI 2021. ☆49 · Updated 3 years ago
- ☆65 · Updated last year
- A PyTorch implementation of CMT based on paper CMT: Convolutional Neural Networks Meet Vision Transformers. ☆71 · Updated 2 years ago
- ☆35 · Updated 3 years ago
- How Much Position Information Do Convolutional Neural Networks Encode? ☆11 · Updated 3 years ago
- ReViT - Residual Attention Vision Transformer ☆32 · Updated last year
- Complementing Representation Deficiency in Few-shot Image Classification: A Meta-Learning Approach ☆8 · Updated 4 years ago
- Official PyTorch implementation of "TDAM: Top-down attention module for CNNs" ☆12 · Updated 2 years ago
- The results and code of our IEEE TCYB 2022 paper, titled "Global-and-Local Collaborative Learning for Co-Salient Object Detection" ☆12 · Updated 3 years ago
- [CVPR' 23] Adjustment and Alignment for Unbiased Open Set Domain Adaptation ☆19 · Updated 2 years ago
- Source code for AAAI 2025 paper: FSTA-SNN:Frequency-based Spatial-Temporal Attention Module for Spiking Neural Networks ☆28 · Updated 5 months ago